Abstract: The development and integration of social networking services and smartphones have made it easy for individuals to organize 
impromptu social activities anywhere and anytime. Main challenges arising in organizing impromptu activities are mostly due to the 
requirements of making timely invitations in accordance with the potential activity locations, corresponding to the locations of and the 
relationships among the candidate attendees. Various combinations of candidate attendees and activity locations create a large solution 
space. Thus, in this paper, we propose Multiple Rally-Point Social Spatial Group Query (MRGQ), to select an appropriate activity location 
for a group of nearby attendees with tight social relationships. We first consider a special case of MRGQ, namely the Socio-Spatial Group 
Query (SSGQ), to determine a set of socially acquainted attendees while minimizing the total spatial distance to a specific activity location. 
We prove that SSGQ is NP-hard and formulate an Integer Linear Programming optimization model for SSGQ. We then develop an efficient 
algorithm, called SSGS, which employs effective pruning techniques to reduce the running time to determine the optimal solution. Moreover, 
we propose a heuristic algorithm for SSGQ to efficiently produce good solutions. We next consider the more general MRGQ. Although MRGQ 
is NP-hard, the number of attendees in practice is usually small enough such that an optimal solution can be found efficiently. Therefore, we first 
propose an Integer Linear Programming optimization model for MRGQ. We then design an efficient algorithm, called MAGS, which employs 
effective search space exploration and pruning strategies to reduce the running time for finding the optimal solution. We also propose to further 
optimize efficiency by indexing the potential activity locations. A user study demonstrates the strength of using SSGS and MAGS over manual 
coordination in terms of both solution quality and efficiency. Experimental results on real datasets show that our algorithms can process SSGQ 
and MRGQ efficiently and significantly outperform other baseline algorithms, including one based on the commercial parallel optimizer IBM 
CPLEX. 

Index Terms: Query Processing, Group Query, Spatial Indexing, Social Networks 
1 Introduction 

The successful development and integration of social networking 
services and smartphones have driven the recent emergence of 
location-based social networking (LBSN) services. Such services, 
including applications on Foursquare, Meetup, Facebook, and Google+, 
allow users to connect with friends, comment on events and places 
(e.g., restaurants, theaters, and stores), and share their happenings 
and current locations. The availability of users' locations and 
social information allows mobile users to instantly organize 
impromptu social activities anywhere and anytime. 

As an LBSN application, an impromptu activity planning 
service needs to account for both spatial and social factors. In 
other words, both the locations and friends considered need 
to be suitable for the activity, i.e., the location should be close 
to the participants so that they arrive in a timely manner, 
and the invited friends should already be acquainted with 
each other to ensure comity. Thus, a major challenge for 
impromptu activity planning lies in factoring in the distances 
from invitees' current locations to the activity locations, 
along with their shared social connectivity. Note that close 
friends may not be located near a specific activity location, 
while friends near a potential activity location may not enjoy 
tight social relationships. Moreover, when the number of 
candidate attendees increases, or when the number of 
activity locations grows, selecting the most suitable attendees 
and activity location becomes tedious and time-consuming. 
Therefore, impromptu activity planning would benefit 
significantly from efficient query processing algorithms that 
automatically recommend both attendees and an activity 
location. 

Fig. 1: Finding friends for impromptu social activity. 

Motivating Example. The interplay of social relationships 
among activity attendees and the activity locations creates 
significant challenges for the organization of impromptu 
social activities. Figure 1 shows a database of 8 candidate 
attendees with three potential activity locations 
$Q = \{q_1, q_2, q_3\}$. The social relationships among the 
candidate attendees are captured as a social graph (shown as 
the social layer in the figure), while the locations of the 
candidate attendees are shown as the spatial layer. Given 
a desired group size of 4 and a social constraint where each 
attendee can be unfamiliar with at most 1 other attendee, 
one approach to selecting a group and the corresponding 
activity location with minimized total spatial distance is to 
issue a 4-nearest neighbor (4NN) query on each activity 
location. As a result, we obtain $F_1 = \{v_2, v_3, v_4, v_5\}$ with 
the activity location $q_1$. However, in this case, $F_1$ does not 
satisfy the required social constraint because two of its members, 
including $v_2$, are each unacquainted with more than 1 other 
group member. Instead, if we focus on social tightness, we obtain 
a group $F_2$ (containing $v_1$, $v_6$, and $v_8$) with activity 
location $q_2$, where each attendee is familiar with all the other 
members. However, this group incurs a large spatial distance and 
thus is not suitable for an impromptu activity. In contrast, 
$F_3 = \{v_1, v_2, v_5, v_6\}$ with activity location $q_3$ is 
probably the most suitable solution because each attendee in $F_3$ 
is unacquainted with no more than 1 other group member while 
incurring a small total spatial distance to $q_3$. 

In this paper, we propose a new query, namely Multiple 
Rally-Point Social Spatial Group Query (MRGQ), to determine 
a suitable activity location and a socially acquainted 
group which minimizes the total spatial distance to the 
activity location. MRGQ seeks a set of most-suitable attendees 
with a corresponding activity location by considering both 
social and spatial factors of impromptu activity planning. 
MRGQ is beneficial for real social network applications (e.g., 
Facebook) and can integrate with group buying websites 
(e.g., Groupon) to provide social-aware location-based 
advertisements. We will discuss these issues in Section 2.2. 
Here, we assume that the service provider has access to 
the users' underlying social relationships along with their 
current locations. Let $G = (V, E)$ be a social graph, where 
each vertex $v \in V$ is associated with a location $l_v$, and 
two mutually acquainted vertices $u$ and $v$ are connected 
by an edge $e_{u,v}$. Given a set of potential activity locations 
$Q = \{q_1, \ldots, q_n\}$, the planned number of activity attendees 
$p$, the number $k$ of people with whom each attendee may be 
unacquainted, and the maximum spatial distance $t$ (i.e., spatial 
radius) from the chosen activity location to each of the 
selected attendees, MRGQ aims to find a set of $p$ attendees 
from the social graph and an activity location $q^*$ from the 
potential activity location list, such that the total distance 
from the attendees to the activity location $q^*$ is minimal, 
and the distance from each attendee to the activity location 
$q^*$ is bounded by $t$ (see footnote 1). Notice that MRGQ 
includes a social constraint (i.e., $k$) to ensure familiarity 
among the attendees, i.e., each attendee can be unfamiliar with 
at most $k$ other people in the selected group. By setting $k$, the 
coordinator can freely adjust the social atmosphere of the 
activity to accommodate different types of social activities. 
Formally, MRGQ is formulated as follows. 

Problem: Multiple Rally-Point Social Spatial Group Query 
(MRGQ). 

Given: A social graph $G = (V, E)$, location $l_v$ for each 
$v \in V$, the number of attendees $p$, the set of potential 
activity locations $Q$, the familiarity constraint $k$, and the 
spatial radius $t$. 

Objective: $MRGQ(p, Q, k, t)$ finds $(F, q^*)$ where $F \subseteq V$, 
$q^* \in Q$, such that $|F| = p$, $\sum_{v \in F} d_{v,q^*}$ is minimal 
(see footnote 2), $d_{v,q^*} \le t$, and $unfamiliar(v, F) \le k$ 
(see footnote 3), $\forall v \in F$. 

1. In most cases a user can specify p and Q according to the motivation 
of the corresponding group activity, such as a "buy three and get one free" 
coupon in a chain restaurant. While it may be more difficult for a user to 
specify the exact values of k and t, one promising way is to let the user 
select the ranges of the two parameters. Accordingly, the algorithm returns 
multiple solutions with different k and t so that the user can choose the 
most desirable one. 

2. $d_{v,q^*}$ is the spatial distance from v to q*. 

3. $unfamiliar(v, F)$ is the number of vertices in F which share no edge with v. 


A straightforward approach for processing MRGQ is to 
enumerate all possible groups of p attendees for each activity 
location and eliminate those not satisfying the constraints 
on social familiarity and spatial radius. This approach then 
returns the pair of group and activity location which incurs 
the minimum total spatial distance. It needs to enumerate 
$|Q| \cdot \binom{|V|}{p}$ candidate pairs of 
groups and locations, entailing an enormous search space. 
Indeed, as we show in the next section, MRGQ is NP-hard. 
However, as p is relatively small in most practical 
impromptu activity scenarios, the problem can still be solved 
efficiently. By carefully exploiting the social and spatial 
constraints in MRGQ, we develop several processing strategies 
to obtain the optimal solution efficiently. We systematically 
explore the search space to avoid examining all 
combinations of candidate attendees and activity locations. 
We incrementally select attendees with the corresponding 
activity location by giving priority to those attendees (i) 
who are close to an activity location, and (ii) who are 
close friends. Obtaining a group which satisfies both (i) 
and (ii) is non-trivial because an algorithm that addresses 
(i) should simultaneously choose suitable attendees and the 
nearest activity location. Moreover, while achieving (i) may 
quickly obtain a group with small total spatial distance, it 
does not always result in a feasible group that satisfies the 
familiarity constraint. Alternatively, we can address (ii) by 
prioritizing the search for a group of attendees who know 
each other well. However, the group may not have the 
minimum spatial distance to the closest activity location. 
In summary, efficiently processing MRGQ requires carefully 
designed algorithms to select the attendees along with their 
nearby activity location while simultaneously satisfying the 
familiarity constraint. 
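
To make the size of the search space concrete, the straightforward enumeration described above can be sketched in Python as follows. This is only an illustrative baseline under stated assumptions, not the MAGS algorithm proposed in this paper: the function names `unfamiliar` and `mrgq_bruteforce`, the dictionary-based graph representation, and the use of Euclidean distance are all choices made for this sketch.

```python
from itertools import combinations
from math import dist


def unfamiliar(v, group, friends):
    """Number of members of `group` (other than v) that share no edge with v."""
    return sum(1 for u in group if u != v and u not in friends[v])


def mrgq_bruteforce(loc, friends, Q, p, k, t):
    """Exhaustively enumerate every (group, location) pair.

    loc:     dict vertex -> (x, y) location (Euclidean distance is assumed)
    friends: dict vertex -> set of acquainted vertices (the social graph)
    Q:       list of candidate activity locations as (x, y) tuples
    Returns (total_distance, group, q) for the best feasible pair, or None.
    """
    best = None
    for q in Q:
        # Spatial radius: only attendees within distance t of q may be chosen.
        near = [v for v in loc if dist(loc[v], q) <= t]
        for group in combinations(sorted(near), p):
            # Familiarity constraint: each member is unfamiliar with at most k others.
            if any(unfamiliar(v, group, friends) > k for v in group):
                continue
            total = sum(dist(loc[v], q) for v in group)
            if best is None or total < best[0]:
                best = (total, group, q)
    return best


if __name__ == "__main__":
    # Hypothetical toy data, not taken from Figure 1.
    loc = {"a": (0, 0), "b": (1, 0), "c": (0, 1), "d": (5, 5), "e": (1, 1)}
    friends = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "e"},
               "d": {"e"}, "e": {"c", "d"}}
    print(mrgq_bruteforce(loc, friends, Q=[(0, 0), (5, 5)], p=3, k=0, t=10))
```

The two nested loops visit every location and every p-subset of the eligible attendees, which is why this baseline only scales to very small inputs and why ordering and pruning strategies matter.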

To efficiently process MRGQ, we propose to index the 
attendees' locations and the activity locations. In addition, 
we design effective strategies for traversing the search space, 
including Socio-Spatial Ordering and All-Pair Distance 
Ordering, as well as a number of search space pruning rules, 
including Inner-Triangle Distance Pruning, Outer-Triangle 
Distance Pruning, Activity Location Distance Pruning, and 
Familiarity Pruning, to reduce the processing time. During 
the selection of the attendees and the activity location, we 
consider both the spatial distances among the candidate 
locations and those from attendees to activity locations. Meanwhile, 
the social connectivity of the attendees is also carefully 
explored. As such, we effectively prune redundant search 
space to find the optimal solution efficiently. 

The contributions of this paper are summarized as follows. 

• We identify the organization of impromptu social 
activities as a new social networking application and 
formulate a novel query, MRGQ, to obtain the optimal 
set of invitees and a suitable activity location. MRGQ 
is unique because it specifies the familiarity constraint 
among the invitees. We prove that the problem is 
NP-hard and inapproximable within any factor. 

• We consider a special case of MRGQ, namely SSGQ, 
which considers only a single activity location. We prove 
that SSGQ is NP-hard and propose SSGS with various 
strategies for finding the optimal solution efficiently. In 
addition, we propose a heuristic algorithm for SSGQ, 
namely SSGMerge, which effectively exploits the 
structures of intermediate solutions to obtain good solutions 
in polynomial time. We also propose an Integer Linear 
Programming (ILP) optimization model for SSGQ and 
demonstrate that SSGS outperforms ILP. 
• To efficiently process MRGQ, we propose to index 
the locations of candidate attendees and the activity 
locations and propose an efficient algorithm, namely 
MAGS, which enables various search space traversing 
and pruning strategies to find the optimal solution 
efficiently. We also propose an Integer Linear Programming 
(ILP) optimization model for MRGQ and demonstrate 
that MAGS outperforms ILP, even if it runs on a 
commercial integer programming optimizer with parallel 
computation. 

• We conduct a user study with 206 people. The results 
demonstrate that our proposed algorithms significantly 
outperform manual coordination in terms of both 
solution quality and efficiency for both SSGQ and MRGQ. 
We also implement SSGQ in Facebook. 

• We evaluate the performance of the proposed 
algorithms by conducting extensive experiments on real 
datasets. Experimental results show that SSGS and 
SSGMerge require much less time than the ILP 
optimization model with the commercial parallel optimizer 
IBM CPLEX [1]. Likewise, for MRGQ, MAGS 
outperforms the baseline algorithms in terms of both solution 
quality and efficiency, and is much more efficient than 
the ILP optimization model. 

The rest of this paper is organized as follows. Section 
2 analyzes MRGQ and proves that it is NP-hard. Section 
3 introduces the related work. Section 4 studies a special 
case of MRGQ, namely SSGQ, and details the proposed 
algorithms. Section 5 details the proposed algorithm to efficiently 
process MRGQ. Section 7 shows the results of our user study 
and experiments. Finally, Section 8 concludes this paper. 

2 Problem Analysis and Applications 

An MRGQ includes four parameters, i.e., p, Q, k and t, 
which respectively determine the size of the answer group, 
activity locations, familiarity constraint and spatial radius 
of the query, and all of which have a significant impact on 
processing strategies. First, as the size of group, p, increases, 
the solution space (which consists of all candidate groups) 
grows rapidly. While we prove that processing MRGQ is 
an NP-hard problem and thus very challenging, it can still 
be processed efficiently since the size of p is usually small 
in most practical cases. Second, candidate attendees located 
close to a candidate activity location qi could be prioritized 
for processing, as the search criteria aim to minimize the 
total spatial distance from the selected attendees to qi. As 
the size of Q increases, the search space also grows. Third, k 
dictates the tightness of social relationships among members 
in the invited group. A smaller k in MRGQ indicates that 
candidate attendees with tighter social relationships should 
be given priority. Finally, t reflects the need to avoid selecting 
candidates that are unacceptably far away from the selected 
activity location. These spatial and familiarity constraints 
can be employed for pruning of unqualified candidate 
groups. In the following, we first analyze the hardness of 
MRGQ and then discuss concrete application scenarios for 
MRGQ. 

2.1 Problem Analysis 

We prove that MRGQ is NP-hard and inapproximable 
within any factor, i.e., no polynomial-time algorithm with any 
guaranteed approximation ratio exists for MRGQ unless P = NP. 


Theorem 1: MRGQ is NP-hard and is inapproximable 
within any factor unless P = NP. 

Proof: We prove that MRGQ is NP-hard with a 
reduction from p-clique. Given a graph $G_c$, the decision 
problem p-clique determines whether the graph contains a clique 
of size p, i.e., a complete graph of p vertices with an edge 
connecting every two vertices. In MRGQ, let $G = G_c$, $k = 0$, 
$t = \infty$, $Q = \{q\}$, and $d_{v,q} = 1$ for every vertex $v \in V$. 
We first prove the necessary condition. If $G_c$ contains a p-clique, 
there must exist a group with the same vertices as the p-clique such 
that every person has a social relationship with all the other 
attendees in the group, and the total spatial distance is p. 
We then prove the sufficient condition. If $G$ in MRGQ has 
a feasible group of size p with $k = 0$, $G_c$ in problem p-clique 
must contain a clique of size p, too. Therefore, MRGQ is 
NP-hard. 

We prove the inapproximability of MRGQ with a 
gap-introducing reduction from the p-clique problem. Given a 
graph $G_c$, the decision problem p-clique determines whether 
the graph contains a clique of size p, i.e., a complete graph 
of p vertices with an edge connecting every two vertices. 
For any instance of the p-clique problem in graph $G_c$, we 
construct an instance of MRGQ as follows. The input graph 
of MRGQ, $G$, is constructed by adding a complete graph $K_p$ 
with p vertices to $G_c$, i.e., $G = G_c \cup K_p$, where each vertex 
$v \in K_p$ connects to every vertex $u \in G_c$. We set $Q = \{q\}$, 
where $q$ is any spatial object, and the spatial distance from 
each vertex $u \in G_c$ to $q$ is set to 1, i.e., $d_{u,q} = 1, \forall u \in G_c$. 
By contrast, the spatial distance from each vertex $v \in K_p$ to $q$ 
is set to an arbitrary value $l$ much larger than p, i.e., 
$d_{v,q} = l, \forall v \in K_p$. Moreover, $k = 0$ and $t = \infty$ in 
MRGQ. Now, if there is a p-clique in $G_c$, there exists a feasible 
solution of MRGQ, i.e., $F \subseteq G_c$ in $G$, with the total spatial 
distance $\sum_{v \in F} d_{v,q} = p$ (i.e., $F \cap K_p = \emptyset$). 
If no p-clique exists in $G_c$, MRGQ has at least one feasible 
solution, such as $K_p$, but it is not possible to extract a feasible 
solution from $G_c$ alone. Therefore, the optimal solution $F$ 
returned by MRGQ must include at least one vertex in $K_p$, with a 
total spatial distance of $\sum_{v \in F} d_{v,q} \ge (p - 1 + l) > p$ 
(i.e., $F \cap K_p \ne \emptyset$). MRGQ cannot be approximated within 
any factor smaller than $(p - 1 + l)/p$; otherwise, the approximation 
algorithm could solve the p-clique decision problem since it can 
distinguish the two cases in MRGQ. Since $l$ can be set as an 
arbitrary value much larger than p, MRGQ cannot be approximated 
within any ratio. The theorem follows. □ 
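
The gap between the two cases in the proof can be checked numerically on small graphs. The following Python sketch builds the instance $G = G_c \cup K_p$ with distance 1 for the vertices of $G_c$ and distance $l$ for the vertices of $K_p$, then computes the optimum by exhaustive search with $k = 0$, $t = \infty$, and a single location. The helper names `build_gap_instance` and `optimal_total_distance` and the adjacency-set representation are assumptions of this sketch, and the exhaustive search is exponential, so it only illustrates the construction.

```python
from itertools import combinations


def build_gap_instance(gc_vertices, gc_edges, p, l):
    """Return (adjacency, distance-to-q) for G = Gc plus an added clique Kp."""
    kp = [("K", i) for i in range(p)]          # fresh vertices of the added clique
    adj = {v: set() for v in list(gc_vertices) + kp}
    for u, v in gc_edges:                      # original edges of Gc
        adj[u].add(v)
        adj[v].add(u)
    for a, b in combinations(kp, 2):           # Kp is a complete graph
        adj[a].add(b)
        adj[b].add(a)
    for a in kp:                               # every Kp vertex connects to every Gc vertex
        for u in gc_vertices:
            adj[a].add(u)
            adj[u].add(a)
    d = {v: 1 for v in gc_vertices}            # d_{u,q} = 1 for u in Gc
    d.update({a: l for a in kp})               # d_{v,q} = l for v in Kp
    return adj, d


def optimal_total_distance(adj, d, p):
    """Optimum of MRGQ with k = 0, t = infinity, Q = {q}, by exhaustive search."""
    best = None
    for group in combinations(list(adj), p):
        # k = 0 means the group must be a clique in the social graph.
        if all(v in adj[u] for u, v in combinations(group, 2)):
            total = sum(d[v] for v in group)
            best = total if best is None else min(best, total)
    return best


if __name__ == "__main__":
    p, l = 3, 1000
    triangle = build_gap_instance([1, 2, 3], [(1, 2), (2, 3), (1, 3)], p, l)
    path = build_gap_instance([1, 2, 3], [(1, 2), (2, 3)], p, l)
    print(optimal_total_distance(*triangle, p))  # Gc contains a 3-clique
    print(optimal_total_distance(*path, p))      # Gc contains no 3-clique
```

When $G_c$ contains a p-clique the optimum equals p, and otherwise it is at least $p - 1 + l$, so any algorithm with a ratio below $(p - 1 + l)/p$ would separate the two cases.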

We also propose an Integer Linear Programming (ILP) 
optimization model for MRGQ which, via a commercial 
solver such as CPLEX [1], can obtain the optimal solution. 
We first define a number of decision variables in the ILP 
formulation. Let binary variable $\phi_u$ denote whether vertex 
$u$ is in $F$. Let binary variable $\pi_q$ denote whether activity 
location $q$ is chosen in the solution. When $u$ is an attendee 
and thus joins $F$, let integer variable $\mu_u$ denote the number 
of attendees in $F$ not acquainted with $u$, $\mu_u \ge 0$. Let variable 
$\delta_u$ denote the distance from $u$ to the activity location if $u$ is 
selected in $F$, $\delta_u \ge 0$; otherwise, $\delta_u = 0$. The problem is to 
minimize the total spatial distance from the selected 
activity location to the attendees, i.e., 
$\min \sum_{u \in V} \sum_{q \in Q} \pi_q \phi_u d_{u,q}$. 
However, this simple formula does not serve well as the 
objective function because it is not linear. On the other 
hand, the formula $\min \sum_{u \in F} d_{u,q^*}$ does not serve well 
as the objective function since $F$ is unknown. Therefore, we 
formulate the objective function of MRGQ as follows. 

$\min \sum_{u \in V} \delta_u$. 

This objective function correctly captures the total spatial 
distance from the selected activity location to the attendees 
since only $\delta_u$ of each attendee $u$ in $F$ will be assigned a 
non-zero value, as shown in constraint (E) detailed later. In 
other words, $\delta_u$ will be 0 in the objective function if $u$ is not 
an attendee. 

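The ILP formulation for MRGQ is equipped with the 
following constraints, where $N_u$ denotes the set of neighbors 
of $u$ in $G$. 

$\sum_{u \in V} \phi_u = p$, (A) 

$\sum_{q \in Q} \pi_q = 1$, (B) 

$(p - 1)\phi_u - \sum_{v \in N_u} \phi_v \le \mu_u, \ \forall u \in V$, (C) 

$\sum_{u \in V} \mu_u \le kp$, (D) 

$d_{u,q}(\phi_u + \pi_q - 1) \le \delta_u, \ \forall u \in V, \forall q \in Q$, (E) 

$\delta_u \le t, \ \forall u \in V$. (F) 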

In the above, constraint (A) guarantees that exactly p 
vertices are selected in solution set $F$, while constraint (B) 
states that only one location is selected for the activity. 
Constraints (C) and (D) specify the familiarity condition. 
Specifically, if $u$ participates in $F$, i.e., $\phi_u = 1$, constraint (C) 
becomes $(p - 1) - \sum_{v \in N_u} \phi_v \le \mu_u$. In other words, the 
left-hand side (LHS) of constraint (C) is identical to the number 
of attendees in $F$ not knowing $u$, and constraint (D) enforces 
that the total number of unfamiliar attendees does not exceed 
$kp$. 

Constraint (E) assigns $\delta_u$ as $d_{u,q}$ if $u$ and $q$ are chosen 
as an attendee and the activity location, respectively. More 
specifically, $\phi_u$ and $\pi_q$ are both 1 in this case, and constraint 
(E) thus becomes $d_{u,q} \le \delta_u$. Since the objective function is 
a minimization function, $\delta_u$ will be assigned as $d_{u,q}$ in the 
optimal solution. On the other hand, if $u$ is not an attendee, 
or if $q$ is not the activity location, constraint (E) becomes 
at most $0 \le \delta_u$, and is thus non-restrictive to $\delta_u$. Therefore, 
$\delta_u$ will be 0 in the objective function if $u$ is not an attendee. 
Constraint (F) ensures that the spatial distance from each attendee to 
the activity location does not exceed the spatial radius $t$. 

We have the following observations from the above 
constraints. 

1) Constraint (C) cannot be substituted with 
$(p - 1)\phi_u - \sum_{v \in N_u} \phi_v = \mu_u$. Otherwise, if $u$ does 
not join $F$, i.e., $\phi_u = 0$, this constraint becomes 
$-\sum_{v \in N_u} \phi_v = \mu_u$. 
Therefore, constraint (D) cannot correctly sum up the 
number of unfamiliar attendees in $F$, because it 
considers every person $u$ in $V$. To address this issue, one 
approach is to replace constraint (D) with $\sum_{u \in F} \mu_u \le kp$, 
such that only the attendees in $F$ will be considered. 
However, constraint (D) in this case becomes 
non-linear because the set $F$ also needs to be decided. 
In contrast, the proposed constraints (C) and (D) 
effectively avoid the above issue. When $\phi_u = 0$, 
constraint (C) becomes $-\sum_{v \in N_u} \phi_v \le \mu_u$, which allows 
$\mu_u$ to be 0 for constraint (D), such that we are able to 
sum up $\mu_u$ of every person in $V$, even when $u$ is not 
in $F$. Note that $\mu_u$ is also allowed to be assigned a value 
larger than the LHS of constraint (C). However, if constraint 
(D) still holds when $(p - 1) - \sum_{v \in N_u} \phi_v < \mu_u$, it 
guarantees that assigning $\mu_u = (p - 1) - \sum_{v \in N_u} \phi_v$ 
also leads to a solution that does not contradict (D), 
because the LHS of (D) becomes smaller in this case. 
Therefore, the familiarity condition can be enforced 
with the design of $\mu_u$ together with constraints (C) 
and (D). Similarly, constraint (E) cannot be replaced 
with $d_{u,q}(\phi_u + \pi_q - 1) = \delta_u$. 

2) The complexity of this formulation (correlated to the 
number of integral decision variables) can be 
significantly reduced by relaxing the integrality constraint 
that enforces $\mu_u$ to be a non-negative integer. In this 
case, $\mu_u$ can be any non-negative real number, and 
the number of integer variables in this formulation is 
significantly reduced. The formulation in this case is 
still correct because $\phi_u$ still 
needs to be an integer variable. In addition, for any 
solution with $\mu_u$ not an integer, replacing $\mu_u$ 
with the largest integer not exceeding $\mu_u$ must 
also be a feasible solution, since the LHS of constraint 
(C) needs to be an integer. 
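
To see how the auxiliary variables behave, the following Python sketch evaluates constraints (A) to (F) for a given choice of $F$ and $q^*$ without calling any solver. For each vertex it assigns $\mu_u$ and $\delta_u$ the smallest values permitted by (C) and (E), which is what a minimizing solver would also be free to do. The function name `evaluate_ilp_assignment` and the dictionary layout (`N` for neighbor sets, `d` keyed by `(u, q)`) are assumptions of this sketch, not part of the paper's formulation.

```python
def evaluate_ilp_assignment(V, Q, N, d, p, k, t, F, q_star):
    """Check constraints (A)-(F) for a candidate (F, q*) and return
    (feasible, objective), where objective = sum of delta_u over V."""
    phi = {u: int(u in F) for u in V}            # phi_u = 1 iff u is an attendee
    pi = {q: int(q == q_star) for q in Q}        # pi_q = 1 iff q is the chosen location

    # Smallest non-negative mu_u satisfying (C): (p-1)*phi_u - sum_{v in N_u} phi_v <= mu_u
    mu = {u: max(0, (p - 1) * phi[u] - sum(phi[v] for v in N[u])) for u in V}

    # Smallest non-negative delta_u satisfying (E) for every q:
    # d_{u,q} * (phi_u + pi_q - 1) <= delta_u
    delta = {u: max(0, max(d[u, q] * (phi[u] + pi[q] - 1) for q in Q)) for u in V}

    feasible = (
        sum(phi.values()) == p                   # (A) exactly p attendees
        and sum(pi.values()) == 1                # (B) exactly one location
        and sum(mu.values()) <= k * p            # (D) total unfamiliarity bound
        and all(delta[u] <= t for u in V)        # (F) spatial radius
    )
    return feasible, sum(delta.values())


if __name__ == "__main__":
    # Hypothetical toy instance.
    V = ["a", "b", "c", "x"]
    Q = ["q1", "q2"]
    N = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}, "x": set()}
    d = {("a", "q1"): 1, ("b", "q1"): 2, ("c", "q1"): 3, ("x", "q1"): 1,
         ("a", "q2"): 5, ("b", "q2"): 5, ("c", "q2"): 5, ("x", "q2"): 1}
    print(evaluate_ilp_assignment(V, Q, N, d, p=3, k=0, t=4,
                                  F={"a", "b", "c"}, q_star="q1"))
```

Note that for a non-attendee the expression in (C) is non-positive and the expression in (E) is at most 0, so both `mu[u]` and `delta[u]` fall back to 0, matching the discussion of why inequalities rather than equalities are used.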

2.2 Application Scenarios 

We discuss the reasons why MRGQ is beneficial for real 
social applications, such as Facebook and Groupon. 

1) The initiator is a person included in the solution group. 
The proposed MRGQ can be employed in various online 
social network applications, e.g., Facebook, to initiate 
impromptu activities. Facebook's Event function allows a user 
to initiate an activity by specifying the location and invitees. 
However, it may be difficult for the initiator to select a set 
of invitees with tight social relationships in real time, and 
the multiple candidate locations, e.g., branches in a popular 
chain restaurant, may make it difficult for the initiator to 
manually select a suitable location and the corresponding 
attendees. If MRGQ can be integrated with Facebook, the 
initiator only needs to specify a set of candidate activity 
locations along with the query parameters to quickly identify 
the invitees and a suitable activity location. 

2) The initiator is not a person and thus not included 
in the solution group. In addition, deal-of-the-day services 
such as Groupon can also benefit from MRGQ. Currently, 
Groupon recommends offered deals (e.g., coupons) to users 
according to their preferences or purchase histories. To take 
advantage of a given deal, a customer may need to organize 
a certain number of friends (e.g., "buy three get one"), and 
may be less inclined to buy the coupon if identifying a 
likely group poses difficulty. To address this issue, Groupon 
can exploit MRGQ to provide social-aware location-based 
advertisement. For example, to promote a chain restaurant, 
Groupon can identify groups with tight social relationships 
and thus identify branches suitable for each group. The 
social recommendation can be attached to the location- 
based advertisement to increase the chance of the customer 
purchasing the coupon. In this case, Groupon is an initiator 
not included in the solution group. 

3 Related Work 

Some LBSN applications, e.g., Meetup, have been available 
for activity coordination for some time. However, they are 
designed mainly for periodic meetings, e.g., a reading club 
or a user group for 3D printing. In this paper, we emphasize 
the scenarios of impromptu social activities where the time 
and effort for organizing an activity need to be minimized. 
As manual identification of candidate attendees, a common 
practice today, is tedious and time-consuming, we argue 
and show in this paper that MRGQ is very useful for such 
scenarios as it recommends a group of suitable attendees 
and an activity location by taking both the social and spatial 
factors into account. 

Research on finding groups of socially connected 
members, e.g., team formation [3], [4], community search [5], 
Social-Temporal Group Query [8], and Circle of Friend Query 
[9], has been reported in the literature. Nevertheless, the 
research context and objectives of these works are different from 
our research goal, i.e., exploring both the spatial and social 
dimensions in finding a group of friends and a location for 
an impromptu activity. Specifically, team formation [3], [4] 
finds a group of experts with the required skills, while 
aiming to minimize the communication cost between these 
experts. Community search [5] finds a compact community 
that contains particular members, aiming to minimize the 
total degree in the community. Social-Temporal Group Query 
[8] checks the available times of attendees to find the group 
with the most suitable activity time. Circle of Friend Query 
[9] finds a group of friends by considering their social and 
spatial properties. The friends are not grouped to specific 
activity locations because no activity location is given in this 
query, and this query thus is not suitable for impromptu 
activity planning. 

Relevant to our work, spatial queries for selecting a set of 
spatial points, aiming to minimize the total spatial distance, 
have been proposed for various scenarios [6], [7], [10], [11]. 
However, in these works, the (social) connectivity among the 
spatial points is not considered. Specifically, given two sets 
of points P and Q, together with the number of points to 
be selected k, Group Nearest Neighbor Query [6] finds a set 
of k points in P such that the total spatial distance of the 
points to all points in Q is minimized. On the other hand, 
for a line segment and a set of points, Continuous Nearest 
Neighbor Search [7] returns the nearest neighbor of each 
point on the line segment. Meanwhile, Continuous Visible 
Nearest Neighbor Queries [10] and Continuous Obstructed 
Nearest Neighbor Query [11] extend Continuous Nearest 
Neighbor Search [7] by incorporating obstacles in the 
problem designs, which may affect the visibility or distance 
between two points and lead to different results. Therefore, 
the above-mentioned queries focus only on the spatial 
dimension and thereby are not applicable to our scenario of 
LBSN applications. 

To the best of the authors' knowledge, research on 
finding groups that considers constraints in both the spatial and 
social dimensions has only just started. Our work examines the 
interplay of the social and spatial dimensions, with the objective 
of finding a group of mutually familiar attendees such that the 
total spatial distance to an activity location is minimized. We 
envisage that our research results can be employed in various 
LBSN applications for group recommendation. 

4 Socio-Spatial Group Query (SSGQ) 

The challenges for processing MRGQ lie in the interplay of 
social and spatial dimensions, along with the large solution 
space. In this section, we first consider a relaxed version 
of MRGQ with single activity location, i.e., Socio-Spatial 
Group Query (SSGQ). We formulate SSGQ and propose 
an Integer Linear Programming (ILP) optimization model 
for SSGQ, which acts as a baseline for comparison with 
the proposed algorithms for SSGQ. We then propose an 
algorithm, called SSGS, to efficiently process SSGQ. We also 
propose a heuristic algorithm for SSGQ, namely SSGMerge, 
to find good solutions very efficiently. 

Specifically, SSGQ is formally defined as follows. 

Problem: Socio-Spatial Group Query (SSGQ). 

Given: A social graph $G = (V, E)$, location $l_v$ for each $v \in V$, 
and an $SSGQ(p, q, k, t)$ where $p$ is the number of attendees, 
$q$ is the activity location, $k$ is the familiarity constraint, and 
$t$ is the spatial radius. 

Objective: To find a set $F \subseteq V$ where $|F| = p$ and minimize 
the total spatial distance from $F$ to $q$, i.e., $\sum_{v \in F} d_{v,q}$, where 
$d_{v,q} \le t, \forall v \in F$, and $unfamiliar(v, F) \le k, \forall v \in F$. 

Theorem 2: SSGQ is NP-hard. 

Proof: We prove that SSGQ is NP-hard with a 
reduction from p-clique. Given a graph $G_c$, the decision 
problem p-clique determines whether the graph contains a clique 
with p vertices, i.e., a complete graph with an edge connecting 
every two vertices. In SSGQ, we let $G = G_c$, $k = 0$, $t = \infty$, and 
$d_{v,q} = 1$ for every vertex $v \in V$. We first prove the necessary 
condition. If $G_c$ contains a p-clique, there must exist a group 
with the same vertices as the p-clique such that every person 
has a social relationship with all the other attendees of the 
group, and the total spatial distance is p. We then prove the 
sufficient condition. If $G$ in SSGQ contains a feasible group of 
size p with $k = 0$, $G_c$ in problem p-clique must contain a 
clique of size p, too. The theorem follows. □ 
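
The reduction in the proof can be illustrated with a short Python sketch that decides p-clique by posing an SSGQ instance with $k = 0$, $t = \infty$, and unit distances. The exhaustive solver `ssgq_bruteforce` is a stand-in written for this sketch and is not the SSGS algorithm; the names `ssgq_bruteforce` and `has_p_clique` and the adjacency-set input are assumptions.

```python
from itertools import combinations
from math import inf


def ssgq_bruteforce(adj, d, p, k, t):
    """Exhaustive SSGQ for one activity location.

    adj: dict vertex -> set of acquainted vertices
    d:   dict vertex -> distance to the activity location q
    Returns (total_distance, group) for the best feasible group, or None.
    """
    candidates = [v for v in adj if d[v] <= t]   # spatial radius filter
    best = None
    for group in combinations(candidates, p):
        # Each member may be unfamiliar with at most k other members.
        feasible = all(
            sum(1 for u in group if u != v and u not in adj[v]) <= k
            for v in group
        )
        if feasible:
            total = sum(d[v] for v in group)
            if best is None or total < best[0]:
                best = (total, group)
    return best


def has_p_clique(adj, p):
    """Decide p-clique through the SSGQ instance used in the proof."""
    d = {v: 1 for v in adj}                      # d_{v,q} = 1 for every vertex
    return ssgq_bruteforce(adj, d, p, k=0, t=inf) is not None


if __name__ == "__main__":
    # Hypothetical graph: a triangle {1, 2, 3} with a pendant vertex 4.
    adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
    print(has_p_clique(adj, 3), has_p_clique(adj, 4))
```

A feasible group with $k = 0$ is exactly a clique, so any exact SSGQ solver answers the clique question, which is the direction the hardness argument relies on.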

In the following, we present an Integer Linear 
Programming (ILP) optimization model for SSGQ. We first define a 
number of decision variables in the formulation. Let binary 
variable $\phi_u$ denote whether vertex $u$ is in $F$. When $u$ joins $F$, 
let integer variable $\mu_u$ denote the number of attendees in $F$ 
not acquainted with $u$, $\mu_u \ge 0$. The problem is to minimize 
the total spatial distance from each vertex in $F$ to $q$, i.e., 

$\min \sum_{u \in V} d_{u,q} \phi_u$ 

s.t. 

$\sum_{u \in V} \phi_u = p$, (G) 

$d_{u,q} \phi_u \le t, \ \forall u \in V$, (H) 

$(p - 1)\phi_u - \sum_{v \in N_u} \phi_v \le \mu_u, \ \forall u \in V$, (I) 

$\sum_{u \in V} \mu_u \le kp$. (J) 

In the above, constraint (G) guarantees that exactly $p$ vertices are
selected in solution set $F$, while constraint (H) ensures that the spatial
distance from each selected attendee to $q$ does not exceed spatial radius $t$.
Constraints (I) and (J) specify the familiarity condition. Specifically, if
$u$ participates in $F$, i.e., $\phi_u = 1$, constraint (I) becomes
$(p-1) - \sum_{v \in N_u} \phi_v \le \mu_u$. In other words, the left-hand side
(LHS) of (I) is identical to the number of attendees in $F$ not knowing $u$,
and constraint (J) enforces that the total number of unfamiliar attendees
must not exceed $kp$.
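As an illustration of what constraints (G) to (J) require, the following minimal Python sketch enumerates all groups of size $p$ and keeps the feasible one with the smallest total distance. It is an exhaustive reference check rather than an ILP solver, and the names `solve_ssgq_bruteforce`, `unfamiliar_counts`, `dist`, and `adj` are hypothetical, introduced only for this example.

```python
from itertools import combinations


def unfamiliar_counts(group, adj):
    """For each u in the group, count the other members not adjacent to u.

    This is the value that mu_u must cover in constraint (I) when phi_u = 1.
    """
    return {u: sum(1 for v in group if v != u and v not in adj[u]) for u in group}


def solve_ssgq_bruteforce(dist, adj, p, t, k):
    """Return (total_distance, group) of the best feasible group, or (None, None).

    dist: vertex -> distance to q (assumed input format)
    adj:  vertex -> set of acquainted vertices (assumed input format)
    """
    candidates = [u for u in dist if dist[u] <= t]       # constraint (H)
    best = (None, None)
    for group in combinations(sorted(candidates), p):    # constraint (G)
        mu = unfamiliar_counts(group, adj)
        if sum(mu.values()) > k * p:                     # constraint (J)
            continue
        total = sum(dist[u] for u in group)              # objective
        if best[0] is None or total < best[0]:
            best = (total, group)
    return best
```

Such an enumeration is only practical for very small inputs, but it can serve as a ground truth when testing an ILP model or the pruning rules of SSGS.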

We make the following observations from the above constraints.

1) Constraint (I) cannot be substituted with
$(p-1)\,\phi_u - \sum_{v \in N_u} \phi_v = \mu_u$. Otherwise, if $u$ does not
join $F$, i.e., $\phi_u = 0$, this constraint becomes
$-\sum_{v \in N_u} \phi_v = \mu_u$. Therefore, constraint (J) cannot correctly
sum up the number of unfamiliar attendees in $F$, because it considers every
person $u$ in $G$. To address this issue, an approach is to replace
constraint (J) with $\sum_{u \in F} \mu_u \le kp$, such that only the attendees
in $F$ will be considered. However, the constraint in this case becomes
non-linear because the set $F$ also needs to be decided. In contrast, the
proposed constraints (I) and (J) avoid the above issue. When $\phi_u = 0$,
constraint (I) becomes $-\sum_{v \in N_u} \phi_v \le \mu_u$, which allows
$\mu_u$ to be 0 for constraint (J), such that we are able to sum up $\mu_u$ of
every person in $V$, even when $u$ is not in $F$. Note that $\mu_u$ is also
allowed to be assigned a value larger than the LHS of constraint (I).
However, if constraint (J) still holds when
$\mu_u > (p-1) - \sum_{v \in N_u} \phi_v$, it guarantees that assigning
$\mu_u = (p-1) - \sum_{v \in N_u} \phi_v$ also leads to a solution that does
not contradict (J), because the LHS of (J) becomes smaller in this case.
Therefore, the familiarity condition can be enforced with the design of
$\mu_u$ together with constraints (I) and (J).

2) The complexity of this formulation (correlated to the number of integral
decision variables) can be significantly reduced by relaxing the integrality
constraint that enforces $\mu_u$ to be a non-negative integer. In this case,
$\mu_u$ can be any non-negative real number, and the number of integer
variables in this formulation is significantly reduced. The formulation in
this case is still correct because $\phi_u$ in the objective function still
needs to be an integer variable. In addition, for any solution in which
$\mu_u$ is not an integer, replacing $\mu_u$ with the largest integer not
exceeding $\mu_u$ must also be a feasible solution, since the LHS of
constraint (I) needs to be an integer.

4. The average number of vertices in F sharing no edge with v.

4.1 Algorithm Design for SSGQ 

Despite only considering a single activity location, processing SSGQ is
still challenging since we need to account for the interplay between social
and spatial factors, which necessitates a systematic approach for group
formation. Therefore, in this section, we propose an algorithm, called SSGS,
to efficiently process SSGQ. SSGS adopts a branch-and-bound group formation
process to form feasible groups, i.e., those that consist of $p$ members and
satisfy the query constraints. The basic idea is to maintain an intermediate
group $S_I$ and incrementally add a candidate member from the remaining set
of candidates, $S_R$, based on some ordering strategies to traverse the space
of group formation. Given a candidate attendee set $V$ and the activity
location $q$, SSGS initializes $S_I = \emptyset$ and $S_R$ as the candidate
attendees within the spatial radius of $q$. At each subsequent iteration,
SSGS moves a candidate attendee from $S_R$ into $S_I$ until $S_I$ becomes a
feasible solution. If $S_I$ is disqualified during the process, SSGS
backtracks to the previous step to choose another candidate attendee from
$S_R$. When $S_I$ becomes feasible, SSGS saves it as the current best
solution and backtracks to the previous step to continue finding better
groups. Obviously, this process is slow, so the key issue is how to devise a
traversal ordering strategy to quickly find a feasible group and effective
rules to prune redundant groups.

One approach is to use an R-tree which indexes the locations of candidates
to provide guidance, and select a candidate from $S_R$ with the shortest
spatial distance to the activity location, which is referred to as Distance
Ordering. As such, we can use the spatial properties derived via the minimum
bounding rectangles (MBR) in the R-tree and the constraints of SSGQ to prune
unqualified candidates and thus reduce the search space. Another approach
aims to quickly form a feasible group with a small total spatial distance to
the activity location for distance-based pruning by adopting a Socio-Spatial
Ordering, which prioritizes the growth of an intermediate group based on its
social tightness. Recall that Distance Ordering first expands $S_I$ with the
individuals closest to the query point $q$. For example, consider Figure 2(a)
as the input social graph (the number beside each node indicates the spatial
distance to $q$), where $p = 3$ and $k = 0$. Figure 2(b) presents the
expansion of $S_I$ with only Distance Ordering, and the number beside each
node in the branch-and-bound tree represents the expansion sequence. As
shown in Figure 2(b), the expansion sequence of these nodes is sorted
according to the spatial distance to the query point. The leaf nodes in the
branch-and-bound tree (i.e., the groups of $p$ individuals) are created in
order of total spatial distance, i.e., a group with a smaller total spatial
distance is generated earlier. However, employing only the Distance Ordering
strategy is not always good because it ignores the social constraint of the
generated groups. As a result, most groups generated at the early stage,
e.g., {a, b, c}, {a, b, d}, {a, b, e}, and {a, b, f}, do not satisfy the
familiarity constraint (i.e., $k = 0$) even though they are the top-4 groups
with the smallest total spatial distances.

To address the weakness of Distance Ordering, we combine the social
connectivity and spatial distance to identify an intermediate group to be
expanded in the next step. Intuitively, when an individual $v$ is chosen by
Distance Ordering, we move it into $S_I$ only when $v$ also satisfies the
social condition specified in Eq. (1). This social condition ensures that
$S_I$ together with $v$ leads to a group with the attendees familiar with
each other. If $v$ does not follow the above social condition, we find
another individual $u$ with Distance Ordering that satisfies the social
condition. As such, both spatial and social factors are taken into account in
Socio-Spatial Ordering.

More specifically, to ensure that the social connectivity of each selected
individual $v$ to the vertices in $S_I$ is good, a simple approach is to
ensure that $v$ can be selected only when the number of edges between $v$ and
the vertices in $S_I$ exceeds a given threshold. With a larger threshold, a
candidate attendee that is familiar with more attendees currently in $S_I$ is
inclined to be chosen. Nevertheless, parameter $k$ is not examined for the
current attendees in $S_I$ when $v$ is added. Consequently, some attendees in
this case may not have a sufficient number of neighbors in $S_I$. By
contrast, SSGS selects $v$ only when $v$ satisfies Eq. (1). Specifically, as
Eq. (1) assumes that $v$ is added to $S_I$, SSGS examines whether the social
connectivity of the new group $S_I \cup \{v\}$ is sufficient according to the
criterion $k$. Let $F(S_I)$ denote the average number of acquainted members
in $S_I$, i.e., $F(S_I) = \frac{1}{|S_I|} \sum_{v \in S_I} |N_v \cap S_I|$,
where $N_v$ is the set of neighbors of $v$ in $V$. Individual $v$ is added to
$S_I$ if it satisfies Eq. (1) as follows,

$$F(S_I \cup \{v\}) \ge |S_I \cup \{v\}| - \frac{\theta\,|S_I \cup \{v\}|}{p-1} - 1 \qquad (1)$$


where $\theta$ here is a dynamically adjusted parameter and set as $k$
initially. Intuitively, when $k = p - 1$, the activity allows all attendees
to be mutually unfamiliar. In this case, Distance Ordering is the best
strategy. In fact, Eq. (1) in this situation becomes
$F(S_I \cup \{v\}) \ge -1$, and Socio-Spatial Ordering here is identical to
Distance Ordering. In another extreme case where $k = 0$, Eq. (1) becomes
$F(S_I \cup \{v\}) \ge |S_I \cup \{v\}| - 1$, implying that each attendee in
$S_I \cup \{v\}$ needs to be acquainted with all the others in
$S_I \cup \{v\}$.
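The social condition can be made concrete with a short Python sketch. It assumes the form of Eq. (1) that is consistent with the two extreme cases discussed above ($\theta = p - 1$ gives a bound of $-1$, and $\theta = 0$ gives $|S_I \cup \{v\}| - 1$). The function names and the adjacency format (`adj` maps each vertex to the set of its acquaintances) are hypothetical choices for this example.

```python
def avg_acquainted(group, adj):
    """F(S): average number of acquaintances each member has inside the group."""
    if not group:
        return 0.0
    return sum(len(adj[v] & group) for v in group) / len(group)


def satisfies_eq1(s_i, v, adj, p, theta):
    """Check the assumed form of Eq. (1) for the tentative group S_I with v added."""
    grown = set(s_i) | {v}
    bound = len(grown) - theta * len(grown) / (p - 1) - 1
    return avg_acquainted(grown, adj) >= bound


# Toy graph (assumed, not the graph of Figure 2): a, c, d form a triangle, b is isolated.
adj = {"a": {"c", "d"}, "b": set(), "c": {"a", "d"}, "d": {"a", "c"}}
print(satisfies_eq1({"a"}, "b", adj, p=3, theta=0))  # b is not acquainted with a
print(satisfies_eq1({"a"}, "c", adj, p=3, theta=0))  # c is acquainted with a
```

With $\theta = 0$ the check accepts only vertices that keep the group a clique, while with $\theta = p - 1$ every vertex passes and the ordering degenerates to Distance Ordering.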

Fig. 2: Example of ordering strategies.

It is worth noting that Eq. (1) incorporates the dynamically adjusted
parameter $\theta$ instead of including $k$ directly, so that it properly
handles other cases with $0 < k < p - 1$. When $k = 0$, if no vertex from
$S_R$ satisfies Eq. (1), it is not necessary to add any individual $v$ from
$S_R$ to $S_I$ because every solution growing from $S_I \cup \{v\}$ does not
follow the familiarity constraint. When $k > 0$, if no individual from $S_R$
satisfies Eq. (1), it does not imply that every solution growing from
$S_I \cup \{v\}$ lacks sufficient social connectivity. In contrast, it is
possible to find an individual $v$ in $S_R$ and a solution growing from
$S_I \cup \{v\}$ when other vertices added later bring a sufficient number of
edges to the solution. Therefore, for $k > 0$, Socio-Spatial Ordering sets
$\theta$ as $k$ initially and increases $\theta$ if no vertex from $S_R$ can
satisfy Eq. (1), until at least one vertex satisfies Eq. (1) and thereby can
be selected for $S_I$. Notice that Eq. (1) first maintains a high criterion
for the social connectivity by setting $\theta$ as $k$, in order to
prioritize a vertex leading to sufficient social connectivity. If no vertex
from $S_R$ can satisfy such a high criterion, Socio-Spatial Ordering
increases $\theta$ to avoid filtering out any feasible solution. Thus, any
vertex in $S_R$ that did not satisfy Eq. (1) previously will be examined
later with a larger $\theta$ accordingly.
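The adjustment of $\theta$ described above can be sketched as follows. The text does not state by how much $\theta$ is increased, so the unit step below is an assumption, as are the function names and the form of Eq. (1) used in `eq1_holds`.

```python
def eq1_holds(group, adj, p, theta):
    """Assumed form of Eq. (1), evaluated on the already-grown group."""
    n = len(group)
    f = sum(len(adj[u] & group) for u in group) / n
    return f >= n - theta * n / (p - 1) - 1


def next_candidate(s_i, s_r, dist, adj, p, k):
    """Pick the next vertex to move from S_R into S_I.

    Returns (vertex, theta), or (None, None) when k = 0 and nobody qualifies.
    """
    ordered = sorted(s_r, key=lambda v: (dist[v], v))    # Distance Ordering
    theta = k
    while theta <= p - 1:
        for v in ordered:
            if eq1_holds(set(s_i) | {v}, adj, p, theta):
                return v, theta
        if k == 0:
            return None, None     # no relaxation is needed when k = 0
        theta += 1                # assumed step size; the source gives none
    return None, None
```

Because the bound in Eq. (1) drops to $-1$ at $\theta = p - 1$, the loop always returns a vertex for $k > 0$ as long as $S_R$ is not empty.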

Figure 2(c) presents an illustrative example of Socio-Spatial Ordering with
$p = 3$ and $k = 0$ for the graph in Figure 2(a). The exploration of
Socio-Spatial Ordering is shown as the solid line in Figure 2(c). In this
example, $\theta = 0$ and $S_I = \emptyset$ initially. Since $a$ is the
vertex with the minimum spatial distance to $q$, and
$F(\emptyset \cup \{a\}) = 0 \ge 1 - \frac{0 \cdot 1}{3-1} - 1$ satisfies
Eq. (1), SSGS moves vertex $a$ from $S_R$ to $S_I$ first and lets
$S_I = \{a\}$. However,
$F(S_I \cup \{b\}) = 0 < 2 - \frac{0 \cdot 2}{3-1} - 1$ does not satisfy
Eq. (1). Therefore, SSGS examines vertex $c$ and finds that
$F(S_I \cup \{c\}) = \frac{1}{2} \cdot 2 = 1 \ge 2 - \frac{0 \cdot 2}{3-1} - 1$
satisfies Eq. (1). Therefore, vertex $c$ is moved into $S_I$, and now
$S_I = \{a, c\}$. We then expand $S_I$ by choosing vertex $d$, and
$S_I = \{a, c, d\}$ now is a feasible solution. In contrast, Distance
Ordering selects vertex $b$ after vertex $a$ (as shown by the dashed line in
Figure 2(c)) and then sequentially constructs four intermediate groups
{a, b, c}, {a, b, d}, {a, b, e}, and {a, b, f}. Unfortunately, none of these
meets the familiarity constraint. This example illustrates that it is
desirable to jointly consider spatial and social domains in order to find a
feasible solution for SSGS earlier, because the obtained feasible solution is
a key factor for the pruning strategy introduced below.


4.2 Pruning Strategies for SSGS 

We also propose two pruning rules, namely Familiarity Pruning and Distance
Pruning, which effectively filter out unqualified intermediate groups. The
idea of Familiarity Pruning is to derive an upper bound on the number of
acquaintances each member may have after new members are included into
$S_I$. Similarly, Distance Pruning identifies a lower bound on the total
spatial distance of each group grown from $S_I$. SSGS stops processing $S_I$
and backtracks if the current $S_I$ is pruned by Familiarity Pruning or
Distance Pruning.

Familiarity Pruning. Specifically, the edges in any solution growing from
$S_I$ can be divided into three categories: 1) $E_I$: the set of edges
connecting any two vertices in $S_I$, 2) $E_R$: the set of edges connecting
any two vertices selected from $S_R$, and 3) $E_{IR}$: the set of edges
connecting a vertex in $S_I$ and a vertex selected from $S_R$. Apparently,
$|E_I| = \frac{1}{2} \sum_{v \in S_I} |N_v^I|$, where $N_v^I$ is the set of
acquainted neighbors of $v$ in $S_I$. Since the selected vertices in $S_R$
are not yet known, a good way is to find an upper bound on $|E_R|$, i.e.,
$\frac{1}{2}(p - |S_I|) \max_{v \in S_R} |N_v^R|$, where $N_v^R$ is the set
of acquainted neighbors of $v$ in $S_R$. It is an upper bound because the
vertex with the maximum degree in $S_R$ is identified, and $(p - |S_I|)$
vertices are selected from $S_R$. Similarly, an upper bound on $|E_{IR}|$ is
$\sum_{v \in S_I} |\mathit{InterEdge}(v)|$, where $\mathit{InterEdge}(v)$ is
the set of edges connecting $v$ in $S_I$ to any vertices in $S_R$.

Notice that the number of edges in a feasible solution is half of the total
degree of all the vertices in the solution. Therefore, with the above three
categories of edges, Familiarity Pruning stops processing $S_I$ when the
following condition holds,

$$\frac{1}{p}\Big[\sum_{v \in S_I} |N_v^I| + (p - |S_I|) \max_{v \in S_R} |N_v^R| + 2 \sum_{v \in S_I} |\mathit{InterEdge}(v)|\Big] < (p - k - 1). \qquad (2)$$

In the above inequality, the left-hand side is an upper bound on the average
number of attendees acquainted with each person in any feasible solution
growing from $S_I$. The condition states that, on average, each attendee is
acquainted with fewer than $p - k - 1$ other attendees. Familiarity Pruning
stops processing $S_I$ and backtracks if solutions growing from $S_I$ via the
exploration of $S_R$ do not satisfy the familiarity constraint.

For the social graph in Figure 2(a) with $p = 3$ and $k = 0$, if
$S_I = \{b, d\}$ and $S_R = \{e, f\}$, SSGS stops processing $S_I$ and
backtracks because
$\frac{1}{3}(0 + 1 \cdot 1 + 2 \cdot 1) = 1 < (3 - 0 - 1) = 2$. In other
words, moving any vertex from $S_R$ to $S_I$ will never generate a feasible
solution following the familiarity constraint.
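The bound in Eq. (2) is straightforward to evaluate from adjacency sets. The sketch below is one possible reading of the three terms. Since Figure 2(a) is not reproduced here, the toy edges are assumptions chosen only so that the three terms evaluate to 0, 1, and 1 as in the example above, and the function name is hypothetical.

```python
def familiarity_prune(s_i, s_r, adj, p, k):
    """Return True when Eq. (2) holds, i.e., S_I can be pruned."""
    inside = sum(len(adj[v] & s_i) for v in s_i)                    # sum of |N_v^I|
    max_rem = max((len(adj[v] & s_r) for v in s_r), default=0)      # max |N_v^R|
    inter = sum(len(adj[v] & s_r) for v in s_i)                     # sum of |InterEdge(v)|
    upper = (inside + (p - len(s_i)) * max_rem + 2 * inter) / p
    return upper < p - k - 1


# Assumed edges: b-e and e-f only (b and d are not acquainted).
adj = {"b": {"e"}, "d": set(), "e": {"b", "f"}, "f": {"e"}}
print(familiarity_prune({"b", "d"}, {"e", "f"}, adj, p=3, k=0))
```

Here the upper bound on the average degree is (0 + 1 + 2) / 3 = 1, which is below the required 2, so the intermediate group is discarded.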

Distance Pruning. For a given $S_I$, $p - |S_I|$ vertices must be selected
from $S_R$ to $S_I$. Apparently, further processing of $S_I$ is unnecessary
if $S_I$ and the $p - |S_I|$ vertices with the shortest spatial distances to
$q$ have a total distance larger than $D$, where $D$ is the best solution
value obtained so far. Therefore, Distance Pruning identifies a lower bound
and stops processing $S_I$ when the following condition holds,

$$\sum_{u \in S_I} d_{u,q} + (p - |S_I|)\, d_{v_{\min},q} > D \qquad (3)$$

where the first term is the total spatial distance from the vertices in
$S_I$ to $q$. For $S_R$, only the vertex $v_{\min}$ with the smallest spatial
distance to $q$ is accessed here, and $(p - |S_I|)\, d_{v_{\min},q}$
represents a lower bound on the total spatial distance for the above
$p - |S_I|$ vertices in $S_R$.

Consider the social graph in Figure 2(a) with $p = 3$ as an example. After a
feasible solution {a, c, d} is explored, its total spatial distance 27 is
assigned to $D$. When SSGS considers $S_I = \{a, b\}$ and $S_R = \{e, f\}$,
since $\sum_{u \in S_I} d_{u,q} + (p - |S_I|)\, d_{v_{\min},q} = 11 + 1 \cdot 19 = 30 > 27$,
Distance Pruning removes states {a, b, e} and {a, b, f}, stops processing
$S_I = \{a, b\}$, and backtracks to the previous state accordingly.
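The lower bound of Eq. (3) can be sketched in a few lines of Python. The individual distances below are assumptions chosen only to reproduce the totals of the example (11 for {a, b} and 19 for the nearest remaining vertex), since the values in Figure 2(a) are not available here.

```python
def distance_prune(s_i, s_r, dist, p, best_d):
    """Return True when Eq. (3) holds, i.e., no group grown from S_I can beat best_d."""
    remaining = p - len(s_i)
    if remaining <= 0 or not s_r:
        return False  # nothing left to bound; the caller handles these cases
    lower = sum(dist[u] for u in s_i) + remaining * min(dist[v] for v in s_r)
    return lower > best_d


# Assumed distances: a and b sum to 11, and e is the nearest remaining vertex at 19.
dist = {"a": 5, "b": 6, "e": 19, "f": 20}
print(distance_prune({"a", "b"}, {"e", "f"}, dist, p=3, best_d=27))
```

The bound uses only the nearest vertex in $S_R$, so it is cheap to evaluate when $S_R$ is kept in a priority queue ordered by distance.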


4.3 Heuristic Algorithm for SSGQ

As proved earlier, processing SSGQ is an NP-hard problem. In fact, even when
the spatial distance is the same for every candidate, the problem is still
NP-hard due to the familiarity constraint. Therefore, in the following, we
propose a heuristic algorithm to obtain good solutions efficiently. SSGS
employs the branch-and-bound framework to incrementally improve the solution
and find the optimal solution. A straightforward approach to develop
heuristic algorithms for SSGQ is to stop the branch-and-bound search after
the $i$-th feasible solution is obtained. However, it suffers from two main
drawbacks: 1) the running time is still not bounded by a polynomial, and
2) this approach only maintains the current minimal total spatial distance
for solution space pruning but ignores the possibility of exploiting many
intermediate solutions to further improve the efficiency. To address these
two issues, we propose an algorithm, named SSGMerge, which utilizes the
structures of intermediate solutions to generate a good feasible solution in
polynomial time. The idea is to iteratively merge good socially tight groups
with small spatial distances in different intermediate solutions obtained in
earlier iterations.

Fig. 3: Example of SSGMerge.

Figure 3 presents a social network and a snapshot of the branch-and-bound
tree with $p = 4$, $k = 1$, and $\theta = 1$ for Socio-Spatial Ordering.
After the first feasible solution {a, c, d, e} is obtained, {a, d} and {b, d}
are pruned accordingly by Distance Pruning, while {a, c, d, f} and
{a, c, e, f} have higher total spatial distances than {a, c, d, e}. If the
straightforward heuristic approach stops here, {a, b, c, d} and {a, b, c, e},
both extending from {a, b} and enjoying smaller total spatial distances,
unfortunately are not discovered. In contrast, since {a, b, d} and {a, c, d}
incur small spatial distances, they will become socially dense groups with
small spatial distances if other candidates later join them. A promising
idea is therefore to merge the two groups into {a, b, c, d} (similarly,
merging {a, b, c} and {a, c, e} results in {a, b, c, e}). Based on this idea,
we design a systematic approach to choose a set of suitable groups for
constructing a good feasible solution.

TABLE 1: Intermediate solutions maintained and constructed.

Queue   Maintained and constructed solutions
U_1
U_2     {a, c}, {a, b}
U_3     {a, c, d}, {a, c, e}, {a, b, c}
U_4     {a, c, d, e}, {a, b, c, d}, {a, b, c, e}

The intermediate solutions expanded according to Socio-Spatial Ordering are
created and tailored for each query with the specific parameters and the
activity location. Therefore, it is more efficient for SSGMerge to process
intermediate solutions directly. Given the group size $p$ of SSGQ, we
maintain a set of $p$ intermediate solution queues $\{U_1, \ldots, U_p\}$,
where each element in $U_j$ is an intermediate solution $S_I$ with $j$
attendees. To prioritize the intermediate solutions in $U_{|S_I|}$ with high
social tightness and small spatial distance, we sort the intermediate
solutions in $U_{|S_I|}$ with a ranking function $R(\cdot)$ based on
Socio-Spatial Ordering,

$$R(S_I) = p \cdot t \cdot \theta + \sum_{v \in S_I} d_{v,q}$$

where $\theta \ge k$ is set to the minimum $\theta$ with which $S_I$
satisfies Socio-Spatial Ordering, and $d_{v,q}$ is the spatial distance from
a candidate $v$ to the activity location $q$. The ranking function ranks
$S_I$ based on its $\theta$ value and the total spatial distance to $q$, and
it gives a smaller score to an $S_I$ that has tighter social relationships
and is closer to $q$. Consider an example with the social graph shown in
Figure 3. Assume $p = 4$, $k = 1$, $t = 100$, $S_I = \{a, b, c\}$, and
$S_I' = \{c, d, e\}$. Then, $R(S_I) = 4 \cdot 100 \cdot 2 + 6 = 806$ while
$R(S_I') = 4 \cdot 100 \cdot 1 + 12 = 412$. Therefore, SSGMerge is inclined
to choose $S_I'$.
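The two scores in the example can be reproduced with a direct transcription of the ranking function. The per-vertex distances below are assumptions: the text gives only the totals (6 and 12), and the values 1 to 5 for a to e are chosen because they also agree with the totals of 10 and 13 reported later for {a, b, c, d} and {a, c, d, e}.

```python
def rank(group, dist, p, t, theta):
    """R(S_I) = p * t * theta + total distance of the group to q."""
    return p * t * theta + sum(dist[v] for v in group)


# Assumed distances to q for the vertices of Figure 3.
dist = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5}
print(rank({"a", "b", "c"}, dist, p=4, t=100, theta=2))
print(rank({"c", "d", "e"}, dist, p=4, t=100, theta=1))
```

Because every distance is at most $t$ and a group has at most $p$ members, the term $p \cdot t \cdot \theta$ dominates the distance sum, so groups are compared by $\theta$ first and by total distance second.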

Given the set of $p$ intermediate solution queues $\{U_1, \ldots, U_p\}$, the
basic idea is to merge different pairs of small groups into larger ones. That
is, for each $U_i$, we merge each pair of small intermediate groups
$S_I, S_I' \in U_i$ into a new intermediate solution
$S_I'' = S_I \cup S_I'$, and store $S_I''$ in the corresponding
$U_{|S_I''|}$. If there are more than $\lambda$ intermediate solutions in
$U_i$ after inserting merged intermediate solutions, $U_i$ maintains the
$\lambda$ intermediate solutions with the smallest ranking values according
to $R(\cdot)$. In other words, $\lambda$ here is a filtering parameter for
controlling the quality and the number of the intermediate solutions in each
$U_i$. Therefore, by first setting $i$ as 1 and increasing $i$ by 1 at each
iteration, we can incrementally construct new intermediate solutions.
Finally, we extract the feasible solution which incurs the minimum spatial
distance from $U_p$ and return it as the solution.
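A minimal sketch of this merging loop is given below. It is a simplification under stated assumptions: the scoring function is supplied by the caller (the example uses total distance only and omits the $\theta$ term of $R(\cdot)$), the feasibility check of the final groups is left to the caller, and the names `ssg_merge`, `queues`, and `lam` are hypothetical.

```python
from itertools import combinations


def ssg_merge(queues, p, lam, score):
    """Merge pairs of groups in U_1, ..., U_{p-1}, keeping the lam best per queue.

    queues: dict mapping group size -> list of frozensets (assumed format)
    score:  function mapping a group to its ranking value (smaller is better)
    """
    def keep_best(groups):
        # Ties are broken by member names so the result is deterministic.
        return sorted(groups, key=lambda g: (score(g), sorted(g)))[:lam]

    queues = {size: keep_best(groups) for size, groups in queues.items()}
    for i in range(1, p):
        for g1, g2 in combinations(queues.get(i, []), 2):
            merged = g1 | g2
            # Two distinct groups of size i always merge into a larger group.
            if len(merged) <= p and merged not in queues.setdefault(len(merged), []):
                queues[len(merged)].append(merged)
        queues = {size: keep_best(groups) for size, groups in queues.items()}
    return queues


# Assumed distances and starting queues, loosely following Table 1.
dist = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5}
total = lambda g: sum(dist[v] for v in g)
start = {1: [frozenset("a"), frozenset("b")],
         2: [frozenset("ac")],
         3: [frozenset("acd"), frozenset("ace")]}
result = ssg_merge(start, p=4, lam=3, score=total)
print(sorted(result[4][0]))
```

Under these assumed inputs, {a, b} is built from $U_1$, {a, b, c} from $U_2$, and the three groups of size four from $U_3$, which mirrors the construction order described for Table 1.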

More importantly, SSGMerge employs a pruning strategy to reduce the number of
intermediate solutions under examination. When SSGMerge merges the
intermediate solutions in $U_i$, an intermediate solution $S_I$ can be
discarded if the following condition holds,

$$\sum_{v \in S_I} d_{v,q} + (p - |S_I|) \min_{2 \le j \le p-1} \mu_j > D.$$

In the above condition, $\mu_j$ is the minimum spatial distance of the
candidates existing in $U_j$, and $D$ is the currently best solution value.
The second term measures the minimum increment of the spatial distance of
$S_I$ when $S_I$ is merged with others and becomes a feasible solution. If
this condition holds, any feasible solution expanded from $S_I$ (i.e., having
$S_I$ as a subset) will never become a better solution, and thus $S_I$ can
be safely discarded.

In Table 1, the merged solutions constructed by SSGMerge are shown in bold
with underscores. For example, {a, b} is constructed by merging {a} and {b}
in $U_1$, and {a, b, c, d} can be constructed by merging {a, c, d} and
{a, b, c} in $U_3$, where {a, b, c} is the combination of {a, c} and {a, b}
in $U_2$. After the merging process is completed, we extract the feasible
solution with the minimum spatial distance from $U_p$, i.e., $U_4$ in
Table 1, which is {a, b, c, d}. As compared to the best feasible solution
{a, c, d, e} obtained in Figure 3, the total spatial distance of
{a, b, c, d} is 10, which is smaller than that of {a, c, d, e}, i.e., 13.

SSGMerge involves two parameters, $w$ and $\lambda$, and terminates the
search process after $w$ states have been generated in the branch-and-bound
tree.$^5$ SSGMerge then refines the solutions with the above merge approach.
By effectively restricting the number of generated intermediate solutions,
SSGMerge can efficiently construct a good feasible solution according to the
following theorem.

Theorem 3: The running time of SSGMerge is
$O(p\lambda^2 \log p\lambda + wp^2 + |V|(\log |V|)^2)$.

Proof: SSGMerge first generates $w$ nodes in the branch-and-bound tree
before it merges those intermediate solutions and creates feasible
solutions. Three operations are performed: 1) Socio-Spatial Ordering,
2) Distance Pruning, and 3) Familiarity Pruning. Socio-Spatial Ordering
includes Distance Ordering and the checking of Eq. (1) in Section 4.1. The
distance browsing strategy in Distance Ordering, i.e., iteratively extracting
the candidate attendee with the minimum spatial distance to $q$ from the
R-Tree, is performed $w$ times. In the worst case, the number of R-Tree leaf
node accesses is $O(|V|)$, and the traversal from the root to a leaf node of
the R-Tree incurs $O(\log |V|)$ R-Tree internal node accesses. Since each
R-Tree node access incurs $O(1)$ time for distance computation, the time of
R-Tree node access is $O(|V| \log |V|)$. The priority queue maintained for
Distance Ordering takes $O(\log s)$ time for each insertion and deletion
operation, where $s$ is the size of the priority queue. There are
$O(|V| \log |V|)$ elements inserted into the priority queue, and the
insertion cost of each element is $O(\log(|V| \log |V|))$ (in the worst case,
the size of the priority queue is $O(|V| \log |V|)$). Therefore, the total
cost is $O(|V| \log |V|) \cdot O(\log(|V| \log |V|)) = O(|V|(\log |V|)^2)$.

For checking Eq. (1) of Socio-Spatial Ordering, since the size of $S_I$ does
not exceed $p$, it requires $O(p^2)$ time to compute $F(S_I \cup \{v\})$ for
each examination, i.e., examining whether a vertex $v$ can be included in the
current $S_I$. Therefore, checking Eq. (1) $w$ times takes $O(wp^2)$ time.

Familiarity Pruning is performed in $O(wp^2)$ time for $w$ examinations.
Distance Pruning at each time examines the first element of the priority
queue and the total spatial distance in $S_I$ with $O(p)$ time. Therefore,
Distance Pruning takes $O(wp)$ time for $w$ examinations in the
branch-and-bound tree. In summary, the time complexity of $w$ attempts for
including a node into $S_I$ is
$O(wp^2) + O(wp) + O(|V|(\log |V|)^2) = O(wp^2 + |V|(\log |V|)^2)$.

On the other hand, when SSGMerge merges intermediate solutions, at each
$U_i$, it first ranks the intermediate solutions in $U_i$ with the ranking
function and then discards those with ranks higher than $\lambda$. This step
takes $O(p\lambda^2 \log p\lambda)$ time because in the worst case, each
merged intermediate solution in $U_i$, $1 \le i < j$, is inserted into
$U_j$. It costs $O(\lambda^2)$ time for SSGMerge to combine each pair of
intermediate solutions within each $U_i$ for $1 \le i < p$, including
checking the pruning condition. Therefore, it takes $O(p\lambda^2)$ time for
merging the intermediate solutions. Overall, the running time for SSGMerge
is $O(p\lambda^2 \log p\lambda + wp^2 + |V|(\log |V|)^2)$. □

Please note that $|V|$ in the complexity comes from the worst case of R-Tree
distance browsing. However, with the assumption of a uniform distribution of
the candidates' locations, the expected time of R-Tree distance browsing
becomes $O(w \log |V| \cdot \log(w \log |V|))$. More importantly, the
experimental results show that SSGMerge is much faster than SSGS because
SSGMerge effectively merges intermediate solutions into good feasible
solutions to avoid examining the large search space.

5. Detailed settings of $w$ and $\lambda$ will be presented in the
experimental results.

5 Algorithm Design for MRGQ 

In this section, we turn our attention to Multiple Rally-Point Social Spatial
Group Query (MRGQ), which finds 1) the most suitable activity location from a
set of candidate locations and 2) a socially acquainted group with the
minimal total spatial distance to the activity location. More specifically,
MRGQ aims to find a pair $(F, q^*)$, where $F$ is a socially acquainted group
of $p$ people satisfying the familiarity constraint, and $q^* \in Q$ is a
location in $Q$ such that $(F, q^*)$ incurs the minimum total spatial
distance. MRGQ is more difficult than SSGQ since different candidate social
groups are closer to different locations, which need to be carefully
considered as well.

To address the issue of multiple candidate locations, a straightforward
approach is to repeat the SSGS algorithm $|Q|$ times to sequentially find the
best group for each location. Nevertheless, this straightforward approach is
not efficient because a spatial correlation may exist among multiple activity
locations and thus can be exploited. In addition, it is desirable to design
some effective index structures to facilitate efficient traversal and pruning
of the search space. In this work, we propose to index the candidates with an
R-Tree, while indexing the activity locations with a BallTree [11].
Accordingly, we design new ordering strategies to quickly identify an
activity location near an intermediate group of candidates satisfying the
familiarity constraint and pruning strategies to avoid generating redundant
$(F, q_i)$ pairs, where $F$ is a group of $p$ candidates satisfying the
familiarity constraint. Moreover, two effective strategies for traversing the
search space are proposed, including All-Pair Distance Ordering and
Single-Reference Distance Ordering. Processing time is also improved by
introducing a number of new search space pruning rules, including
Inner-Triangle Distance Pruning, Outer-Triangle Distance Pruning, and
Activity Location Distance Pruning. In summary, during the process of
selecting attendees and an activity location, we exploit both the spatial
distances among different candidate locations and the distances from
attendees to activity locations to prune redundant search space and
efficiently find the optimal solution.

In Section 2, we present an Integer Linear Programming (ILP) formulation for
MRGQ which can obtain an optimal solution via a commercial solver, such as
the IBM CPLEX [1] parallel optimizer, one of the fastest commercial parallel
solvers. However, as shown in Section 7, this still requires an unacceptable
amount of time to find the optimal solution because MRGQ needs to
simultaneously process the spatial and social dimensions. Therefore, in
Section 5.2, we design a new algorithm to efficiently process MRGQ.

5.1 Baseline Algorithms for MRGQ 

The baseline algorithms are extensions of SSGS mentioned in Section 4. While
Socio-Spatial Ordering and Distance Pruning remain the same, we extend
Familiarity Pruning introduced in Section 4.2 to tailor the familiarity
constraint for MRGQ. Specifically, if one of the following conditions holds,
Familiarity Pruning stops moving any candidates into $S_I$, and the algorithm
backtracks to the previous step to consider other candidate attendees.

$$|S_I| - \min_{v \in S_I} |N_v \cap S_I| > k + 1, \text{ or} \qquad (4)$$

$$\sum_{v \in S_R} |S_R \cap N_v| < (p - |S_I|)(p - |S_I| - k - 1), \qquad (5)$$

where $N_v$ is the set of neighbors of $v$ in $V$.

In Eq. (4), $\min_{v \in S_I} |N_v \cap S_I|$ represents the minimum number
of neighbors for each individual in $S_I$. In other words,
$|S_I| - \min_{v \in S_I} |N_v \cap S_I| - 1$ is the maximum number of
unacquainted members for $v$ in $S_I$, and $-1$ is incorporated above to
exclude $v$ herself. If $|S_I| - \min_{v \in S_I} |N_v \cap S_I| - 1 > k$, at
least one individual in $S_I$ has more than $k$ unacquainted members in
$S_I$. This situation violates the familiarity constraint. Therefore, the
pruning strategy holds since any group growing from the current $S_I$ will
never satisfy the familiarity constraint.

Eq. (4) considers the vertex degrees of the individuals in $S_I$. In
contrast, the pruning condition specified in Eq. (5) considers the degrees of
the individuals that have not been moved into $S_I$, i.e., those individuals
that are in $S_R$. In the right-hand side (RHS) of Eq. (5), $(p - |S_I|)$ is
the number of individuals that need to be moved from $S_R$ to $S_I$. On the
other hand, for any solution group that satisfies the familiarity constraint,
the degree of each member is at least $(p - k - 1)$ in the group. Therefore,
if $S_R$ has an individual $u$ with the number of neighbors in $S_R$ smaller
than $(p - |S_I| - k - 1)$, $S_I$ will never grow into a feasible solution
when $u$ is selected into $S_I$. In other words, if the total number of
neighbors that all individuals in $S_R$ have
($\sum_{v \in S_R} |S_R \cap N_v|$) is smaller than
$(p - |S_I|)(p - |S_I| - k - 1)$, selecting any $(p - |S_I|)$ individuals
from $S_R$ into $S_I$ will never generate a feasible solution, and thus this
intermediate group can be trimmed accordingly.

For example, suppose $p = 4$, $k = 0$, and the social graph is shown in
Figure 2(a). If $S_I = \{a, e\}$, then this $S_I$ can be pruned by Eq. (4)
since $2 - 0 > 0 + 1$ holds, i.e., at least one vertex in the current $S_I$
does not have enough friends to satisfy the familiarity constraint.
Similarly, if $p = 5$, $k = 0$, $S_I = \{a\}$, and
$S_R = \{b, c, d, e, f\}$, $S_I$ can also be pruned by Eq. (5) because
$1 + 2 + 1 + 3 + 1 < (5 - 1)(5 - 1 - 0 - 1)$ holds, i.e., the candidates in
$S_R$ do not provide sufficient social tightness for the current $S_I$ to
satisfy the familiarity constraint.

In the following, we introduce two baseline algorithms, namely SSP and SFGP.

Sequential SSGQ Processing (SSP). As discussed earlier, an intuitive approach
for answering MRGQ is to sequentially invoke algorithm SSGS for each activity
location. However, even though the intermediate best solution can be
exploited to prune inferior solutions not yet examined, this approach still
incurs a huge query processing cost because it does not simultaneously trim
multiple activity locations. Therefore, we improve SSP to SFGP as follows.

Sequential Feasible Groups Processing (SFGP). In contrast to SSP that
sequentially explores $|Q|$ branch-and-bound trees (i.e., one for each
activity location), SFGP constructs only one branch-and-bound tree to
facilitate joint exploration of the spatial and social dimensions. In
addition to $S_I$ and $S_R$, for each node in the tree, SFGP also maintains a
set $Q_I$ of remaining activity locations that need to be explored.
Initially, setting $S_I = \emptyset$, $S_R = V$, and $Q_I = Q$, SFGP first
finds a reference activity location $q_{ref} \in Q_I$ to guide the
exploration, where $q_{ref}$ is the closest location to a candidate attendee
$u \in S_R$ (i.e., $q_{ref}$ and $u$ are the spatially closest pair). As
such, $q_{ref}$ can lead to a smaller total spatial distance in early stages
of SFGP. Afterwards, SFGP moves candidates from $S_R$ into $S_I$ according to
Socio-Spatial Ordering (introduced in Section 4.1) based on $q_{ref}$. After
moving a candidate from $S_R$ into $S_I$, SFGP determines whether $S_I$ can
be pruned by Familiarity Pruning in Eqs. (4) and (5). If $S_I$ is pruned by
Familiarity Pruning, SFGP stops moving candidates into the current $S_I$ and
backtracks because the current $S_I$ cannot grow into any feasible solution.
Moreover, each time a candidate is moved into $S_I$, SFGP examines each
activity location $q_i \in Q_I$ with the Distance Pruning condition
(introduced in Section 4.2). An activity location $q_i$ will be removed from
$Q_I$ if it is distant from most members in $S_I$ (i.e., $q_i$ is pruned by
Distance Pruning). While expanding $S_I$, if $Q_I$ becomes empty (i.e., all
activity locations in $Q_I$ are pruned), SFGP stops the expansion and
backtracks.

Fig. 4: Example of SFGP. (a) Social graph and spatial distances.
(b) Branch-and-bound tree of SFGP.

When $S_I$ contains exactly $p$ candidates and satisfies the
familiarity constraint, SFGP computes the spatial distances
from $S_I$ to each activity location in $Q_I$, and extracts the
activity location $q \in Q_I$ which incurs the minimum spatial
distance to $S_I$. If the spatial distance from $S_I$ to $q$ is
smaller than the current minimum distance $D$, SFGP records
$(S_I, q)$, updates $D$, and backtracks to examine other possible
solutions. When the search space is explored, SFGP outputs
the recorded best solution and the corresponding activity
location.
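The final step above (scoring a complete group against the surviving locations and updating the incumbent) can be illustrated with a minimal Python sketch. It assumes points in the plane with Euclidean distance; the names `total_distance` and `best_location` are hypothetical and do not come from the paper.

```python
import math

def total_distance(group, q):
    """Sum of Euclidean distances from every attendee in `group` to location `q`."""
    return sum(math.dist(s, q) for s in group)

def best_location(group, locations, best_so_far=math.inf):
    """Scan the remaining locations for a complete group of p attendees.

    Returns (q, distance) when some location beats `best_so_far` (the
    incumbent D); otherwise returns (None, best_so_far) unchanged.
    """
    best_q = None
    for q in locations:
        d = total_distance(group, q)
        if d < best_so_far:  # strictly better than the recorded solution
            best_q, best_so_far = q, d
    return best_q, best_so_far
```

For example, `best_location([(0, 0), (2, 0), (1, 1)], [(1, 0), (5, 5)])` selects `(1, 0)`, while passing a smaller incumbent such as `best_so_far=2.0` leaves the recorded solution untouched.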


Figure 4(b) presents an example of SFGP to show that
the size of $Q_I$ can rapidly decrease when a few more
candidates are moved into $S_I$. The social network and the
corresponding spatial distances to each $q_i \in Q_I$ are shown
in Figure 4(a). Assume $p = 3$ and $k = 0$. At the beginning,
$S_I = \emptyset$, $Q_I = \{q_1, q_2, q_3\}$ and $S_R = V$. In step (1), SFGP first
identifies $q_{ref} = q_2$ because $q_2$ and candidate attendee $a$ are
the spatially closest pair while $a \in S_R$, and then SFGP moves
$a$ from $S_R$ into $S_I$. In step (2), SFGP moves $b$ from $S_R$ into $S_I$.
Note that $b$ is the candidate attendee in $S_R$ who is closest to
$q_{ref}$. Here, moving $b$ into $S_I$ follows Socio-Spatial Ordering
(SSO), and $S_I \cup \{b\}$ is not pruned by Familiarity Pruning.
In step (3), SFGP moves $c$ into $S_I$, where $S_I = \{a, b, c\}$
satisfies the familiarity constraint. SFGP then scans over the
activity locations in $Q_I$ and extracts $q_2$ because $q_2$ incurs
the minimum total spatial distance to $S_I$. SFGP updates the
currently best solution $(\{a, b, c\}, q_2)$ and its distance value
(i.e., $D = 6$) and backtracks to the previous state as step (4),
i.e., $S_I = \{a, b\}$ and $Q_I = \{q_1, q_2, q_3\}$. SFGP then discovers
that by applying Distance Pruning, all the activity locations
in $Q_I$ can be removed, i.e., moving $d$, $e$ and $f$ into $S_I$
does not generate a better solution. Therefore, SFGP stops
expanding the current $S_I$ and backtracks through step (5).
Now, $S_I = \{a\}$ and SFGP moves $c$ into $S_I$ in step (6). In this
case, $q_3$ can be removed from $Q_I$ because Distance Pruning
indicates that $q_3$ will never lead to any better solution given
the current $S_I$. Therefore, SFGP only needs to examine $q_1$
and $q_2$ in the future expansion of the current $S_I$. During
the process, if SFGP finds a feasible solution with a distance
better than $D$, it records the solution and updates $D$. SFGP
repeats the above procedures and returns the best solution
after the search is complete.

As compared to SSP, SFGP jointly examines the activity
locations and candidate attendees, and employs Distance
Pruning to effectively remove the activity locations that
do not lead to better solutions. It then utilizes Familiarity
Pruning to discard the intermediate groups that cannot grow
into feasible solutions. Moreover, SFGP avoids the repeated
explorations of different social groups, i.e., the same social
group may be generated and examined $|Q|$ times in SSP.
As shown in Section 7, SFGP outperforms SSP. However,
after carefully examining SFGP, we still find a number of 
areas that can be further improved, and thus propose a more 
efficient algorithm as detailed below. 

5.2 Algorithm MAGS for MRGQ 

Although SFGP is able to prune redundant activity locations,
it relies on sequential scans over $Q_I$ to determine whether
a location in $Q_I$ can be safely pruned. Therefore, for every
$q_i$ in $Q_I$, SFGP has to calculate a lower bound on the total
spatial distance of the feasible solution generated from $S_I$
and $q_i$ according to Distance Pruning. On the other hand,
identifying $q_{ref}$ needs a scan over the activity locations in
$Q_I$. Moreover, the selected $q_{ref}$ may not always be good
because SFGP decides $q_{ref}$ before the first candidate attendee
is moved into $S_I$, instead of adaptively changing $q_{ref}$ as $S_I$
grows.

To address these issues, we propose an algorithm, namely
Multiple Activity-Location Group Selection (MAGS), to efficiently
process MRGQ. Similar to SFGP, MAGS processes
multiple activity locations simultaneously. However, MAGS
incorporates the following new ideas: a) an index of activity
locations, b) new distance ordering strategies, including
Single-Reference Distance Ordering and All-Pair Distance
Ordering, and c) new distance pruning strategies, including
Activity Location Distance Pruning, Outer-Triangle Distance
Pruning and Inner-Triangle Distance Pruning. Using an
index for the activity locations avoids sequential scans of
the activity locations in $Q_I$ (i.e., for the selection of $q_{ref}$
and pruning of unnecessary locations). The new distance
ordering strategies obtain $q_{ref}$ more efficiently and enable
$q_{ref}$ to change during the expansion of $S_I$. As a result,
feasible solutions with smaller total spatial distances can
be obtained more effectively. Moreover, the new distance
pruning strategies exploit the interplay between $S_I$ and the
activity locations, as well as the mutual distances of different



Fig. 5: Comparisons of R-Tree and BallTree. 

locations, to effectively and simultaneously prune multiple 
activity locations. 

5.3 Indexing the Activity Locations 

As previously mentioned, SFGP incurs many sequential
scans over the activity locations due to Distance Pruning,
i.e., each time a candidate is moved into $S_I$, $Q_I$ needs to
be scanned to determine whether some activity locations
can be pruned. Moreover, as SFGP extracts $q_{ref} \in Q_I$ and
$u \in S_R$ at the beginning, $q_{ref}$ is not always the closest
activity location for $S_I$ to be expanded afterward, especially
when $S_I$ does not include $u$. Therefore, the proposed All-Pair
Distance Ordering (APDO) is designed to dynamically
select $q_{ref} \in Q_I$ and $u \in S_R$ according to the current
$S_I$ (as described in Section 5.4). More specifically, the next
attendee $u$ that will be moved to $S_I$ and the corresponding
$q_{ref}$ need to minimize the total spatial distance from
$S_I \cup \{u\}$ to $q_{ref}$, i.e.,
$\min_{u \in S_R,\, q_{ref} \in Q_I} \sum_{s \in S_I \cup \{u\}} d_{s,q_{ref}}$.

Equipped with APDO, MAGS finds good feasible solutions
more quickly and prunes the search space with distance pruning
strategies. However, this approach needs $O(|U|)$ sequential
scans over $Q_I$ before a new candidate attendee is identified
and moved into $S_I$.

One way to avoid sequential scans over $Q_I$ is to store
the activity locations in an index structure. This may facilitate
rapid estimation of the spatial distances from
candidates to potential activity locations and thus allow distance
pruning strategies to immediately remove redundant
activity locations from $Q_I$. With such an index structure,
triangular inequality may be exploited in distance pruning
strategies to further reduce distance computations (detailed
later). Although the index structure has to be constructed at
runtime, it can be reused many times in query processing.

We adopt BallTree [11] to index the activity locations. In
BallTree, each activity location $q_i \in Q$ is stored as a leaf
node, and each internal node is the smallest ball
covering all of its children balls, i.e., a ball containing multiple
activity locations. Here, a ball $B$ is associated
with its center $ctr(B)$ and radius $r(B)$. The distance lower
bound from a candidate $u$ to a ball $B$ in 2D space can be
computed as $MINDIST(u, B) = d_{u,ctr(B)} - r(B)$.
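As a small illustration of the point-to-ball bound, the following Python sketch computes $MINDIST(u, B)$ for points in the plane. The function name `mindist_point_ball` is hypothetical, and clamping the result at zero when $u$ lies inside the ball is an assumption added here; the formula in the text is simply $d_{u,ctr(B)} - r(B)$.

```python
import math

def mindist_point_ball(u, center, radius):
    """Lower bound on the distance from point u to any location inside the ball.

    Follows MINDIST(u, B) = d(u, ctr(B)) - r(B). The max(0, ...) clamp is an
    assumption for the case where u lies inside B, since no distance can be
    negative.
    """
    return max(0.0, math.dist(u, center) - radius)
```

With `u = (0, 0)`, a center at `(3, 4)` and radius `2`, every location in the ball is at least `3.0` away from `u`, so a location such as `(3, 2)` on the boundary cannot be closer than that.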

BallTree enables the removal of many unqualified locations
at once, as illustrated in Figure 5(a). To simultaneously
explore and prune multiple activity locations,
a lower bound on the total spatial distance from
$S_I = \{s_1, s_2\}$ to a ball, e.g., $B_1$, can be derived. If this distance
lower bound exceeds the currently best solution value $D$,
it assures that no activity location in $B_1$ will produce
a better solution with any social group grown from $S_I$.
Thus, all activity locations in $B_1$ can be safely pruned. In
Figure 5(a), $\sum_{s_i \in S_I} MINDIST(s_i, B_1)$ serves as a lower
bound on the total spatial distance from $S_I$ to $q_1$ and $q_2$.
Moreover, we can employ triangular inequality to avoid
the distance computation of $\sum_{s_i \in S_I} MINDIST(s_i, B_2)$, i.e.,

$$\sum_{s_i \in S_I} MINDIST(s_i, B_2) = \sum_{s_i \in S_I} \left( d_{s_i,ctr(B_2)} - r(B_2) \right) \geq |S_I| \cdot d_{ctr(B_1),ctr(B_2)} - \sum_{s_i \in S_I} d_{s_i,ctr(B_1)} - |S_I| \cdot r(B_2).$$

Therefore, only the distance from $ctr(B_1)$ to $ctr(B_2)$ needs to be
computed, together with $\sum_{s_i \in S_I} d_{s_i,ctr(B_1)}$, to derive a lower
bound on the spatial distance from $S_I$ to $B_2$. In summary,
instead of invoking sequential scans, which need $|S_I| \cdot m$
distance computations to find the total spatial distances from
$S_I$ to $m$ activity locations, indexing activity locations in
BallTree requires only $|S_I| + (n - 1)$ distance computations,
where $n$ is the number of balls.
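The saving described above can be sketched in Python under the assumption of Euclidean points in the plane. The sum of distances to $ctr(B_1)$ is computed once, and the bound for $B_2$ then needs only the single extra distance $d_{ctr(B_1),ctr(B_2)}$. The names `sum_mindist` and `triangle_lower_bound` are hypothetical.

```python
import math

def sum_mindist(group, center, radius):
    """Direct computation: sum of (d(s, center) - radius), one distance per attendee."""
    return sum(math.dist(s, center) - radius for s in group)

def triangle_lower_bound(sum_to_c1, group_size, c1, c2, r2):
    """Lower bound on sum_mindist(group, c2, r2) that reuses the distances to c1.

    Implements |S_I| * d(c1, c2) - sum_i d(s_i, c1) - |S_I| * r(B_2), which
    costs one new distance computation, d(c1, c2).
    """
    return group_size * math.dist(c1, c2) - sum_to_c1 - group_size * r2
```

The bound is looser than the direct sum, but it is never larger, so pruning decisions based on it remain safe.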

An alternative index is R-Tree, but we argue that BallTree
is more suitable for indexing activity locations here.
Figure 5(b) illustrates an example where the activity locations
are indexed in an R-Tree. As shown, minimum
bounding rectangles (MBRs) are used to provide boundary
information over locations inside them. In Figure 5(b),
$\sum_{s_i \in S_I} MINDIST(s_i, M_1)$ serves as a lower bound on
the total spatial distance from $S_I$ to $q_1$ and $q_2$, where
$MINDIST(s_i, M_1)$ denotes the minimum distance from $s_i$
to MBR $M_1$. However, it is difficult to employ triangular
inequality with R-Tree to quickly obtain a lower bound
on $\sum_{s_i \in S_I} MINDIST(s_i, M_2)$. As shown in Figure 5(b),
where $MINDIST(s_1, M_1) + x > MINDIST(s_1, M_2)$ holds,
the inequality $MINDIST(s_1, M_1) + MINDIST(M_1, M_2) >
MINDIST(s_1, M_2)$ is not guaranteed to hold because
$MINDIST(M_1, M_2) < x$. Therefore, it is necessary to compute
$MINDIST(s_1, M_2)$ and $MINDIST(s_2, M_2)$ directly,
incurring $|S_I| \cdot n$ on-line distance computations to derive all
lower bounds, where $n$ is the number of MBRs. In contrast,
BallTree needs only $|S_I| + (n - 1)$ distance computations with
$n$ balls. Therefore, BallTree is preferable to R-Tree in our
MAGS design.

BallTree brings two advantages to MAGS: 1) BallTree
enables the design of efficient distance ordering strategies.
By traversing both R-Tree (for indexing candidate attendees)
and BallTree (for indexing activity locations), our proposed
distance ordering strategies avoid redundant examinations
of candidate attendees and activity locations to extract the
reference activity location $q_{ref}$. The new distance ordering
strategies, combined with the original Socio-Spatial Ordering
mentioned in Section 4.1, are designed to find good
feasible solutions quickly and prune redundant search space
effectively. 2) BallTree enables distance-based pruning of
multiple activity locations at once in the early stages. Moreover, the
lower bound on the total spatial distance from a set of balls
to $S_I$ can be quickly obtained to facilitate pruning. In the
following, we first propose two distance ordering strategies
and then introduce the distance pruning strategies based on
R-Tree and BallTree.

5.4 Distance Ordering 

While Socio-Spatial Ordering in SSGS is applicable to MAGS,
its design does not consider selections of activity locations.
Here we propose two new distance ordering strategies for
MAGS: (1) Single-Reference Distance Ordering (SRDO). It
selects the activity location along with the first candidate
attendee, $v_{seed}$, for $S_I$. Note that the total spatial distance of
the feasible solutions obtained by SRDO may not be minimal
since only a single location $q_{ref}$ is fixed as a reference. (2)
All-Pair Distance Ordering (APDO). It adaptively changes
the optimal activity location according to different $S_I$, and
always chooses the best activity location when a new attendee


Fig. 6: Example of SRDO. 


is included into $S_I$ to minimize the total spatial distance
from $S_I$ to the new reference activity location $q_{ref}$.

Single-Reference Distance Ordering (SRDO). At the beginning,
i.e., $S_I = \emptyset$, SRDO starts by selecting a seed
candidate $v_{seed}$ and a reference activity location $q_{ref}$ such
that $d_{v_{seed},q_{ref}}$ is minimal. However, to avoid excessive distance
computations, we fix $q_{ref}$ as $S_I$ grows. While SRDO
requires later examination of other activity locations, the
minimized distance may effectively eliminate consideration
of many potential activity locations. To efficiently obtain
$v_{seed}$ and $q_{ref}$, we traverse R-Tree (indexing the candidate
attendees) and BallTree (indexing activity locations) simultaneously,
to reduce the number of distance computations.
To further improve the efficiency, a distance lower bound
from any candidate within an MBR $M_i$ to any activity
location within a ball $B_j$, $MINDIST(M_i, B_j)$, is derived
as $MINDIST(M_i, B_j) = MINDIST(M_i, ctr(B_j)) - r(B_j)$,
where $MINDIST(M_i, ctr(B_j))$ is the minimum distance
from $M_i$ to the center of $B_j$, and $r(B_j)$ is the radius of
$B_j$. This bound is particularly useful for avoiding redundant
examinations of candidate attendees and activity locations
located in distant MBRs and balls in R-Tree and BallTree.
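A minimal Python sketch of the MBR-to-ball bound follows, assuming axis-aligned rectangles given as `(xmin, ymin, xmax, ymax)` and Euclidean distance. The function names are hypothetical, and the clamp at zero for overlapping MBRs and balls is an assumption, motivated by the later example in which overlapping nodes have $MINDIST = 0$.

```python
import math

def mindist_rect_point(rect, p):
    """Minimum distance from an axis-aligned rectangle to a point (0 if inside)."""
    xmin, ymin, xmax, ymax = rect
    dx = max(xmin - p[0], 0.0, p[0] - xmax)
    dy = max(ymin - p[1], 0.0, p[1] - ymax)
    return math.hypot(dx, dy)

def mindist_rect_ball(rect, center, radius):
    """MINDIST(M, B) = MINDIST(M, ctr(B)) - r(B), clamped at 0 when they overlap."""
    return max(0.0, mindist_rect_point(rect, center) - radius)
```

For the rectangle `(0, 0, 2, 2)` and a ball centered at `(5, 2)` with radius `1`, the rectangle is 3 away from the center, so no candidate inside it can be closer than 2 to any location in the ball.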

More specifically, SRDO maintains two lists, $U_R$ and $U_B$,
to record the traversal status of R-Tree and BallTree. Initially,
we insert the root of R-Tree into $U_R$ and the root
of BallTree into $U_B$. Then, at each stage, we find the MBR
$M_i$ in $U_R$ and the ball $B_j$ in $U_B$ that incur the minimum
$MINDIST(M_i, B_j)$. If $M_i$ is not a leaf node in R-Tree, we
pop $M_i$ from $U_R$ and insert its children back into $U_R$; a
non-leaf node $B_j$ in BallTree is handled similarly. If the
extracted $M_i$ and $B_j$ are both leaf nodes, they are assigned
as $v_{seed}$ and $q_{ref}$, respectively. Because the entries in $U_R$
and $U_B$ are popped in accordance with the shortest distance
between them, $v_{seed}$ and $q_{ref}$ are indeed the closest
attendee-location pair. Each candidate attendee $v_i$ and activity
location $q_j$ in any other MBR $M_i$ and ball $B_j$ must incur a larger
spatial distance since $MINDIST(M_i, B_j)$ is a lower bound,
and $d_{v_{seed},q_{ref}} \leq MINDIST(M_i, B_j) \leq d_{v_i,q_j}$. Therefore,
this approach effectively avoids examining attendees and
locations that are mutually distant because their corresponding
MBRs and balls will never be extracted from the lists.
Moreover, if $d_{v_{seed},q_{ref}} > t$ (where $t$ is the spatial radius),
MAGS can stop since there is no feasible solution in this
case.
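The simultaneous traversal can be sketched in Python as a best-first search. This is a simplified variant and not the paper's exact procedure: instead of two separate lists, it keeps a single priority queue of (MBR, ball) pairs keyed by $MINDIST(M_i, B_j)$. The classes `RNode` and `BNode`, the helper names, and the toy coordinates are all hypothetical, and the clamp of $MINDIST$ at zero is an assumption.

```python
import heapq
import itertools
import math
from dataclasses import dataclass, field

@dataclass
class RNode:
    """R-Tree node: an MBR (xmin, ymin, xmax, ymax); a leaf is a degenerate MBR."""
    rect: tuple
    children: list = field(default_factory=list)
    label: str = ""

@dataclass
class BNode:
    """BallTree node: a ball (center, radius); a leaf has radius 0."""
    center: tuple
    radius: float = 0.0
    children: list = field(default_factory=list)
    label: str = ""

def mindist(m, b):
    """MINDIST(M, B) = MINDIST(M, ctr(B)) - r(B), clamped at 0 on overlap."""
    xmin, ymin, xmax, ymax = m.rect
    dx = max(xmin - b.center[0], 0.0, b.center[0] - xmax)
    dy = max(ymin - b.center[1], 0.0, b.center[1] - ymax)
    return max(0.0, math.hypot(dx, dy) - b.radius)

def closest_pair(r_root, b_root):
    """Return (attendee label, location label, distance) of the closest pair."""
    tie = itertools.count()  # tie-breaker so the heap never compares nodes
    heap = [(mindist(r_root, b_root), next(tie), r_root, b_root)]
    while heap:
        dist, _, m, b = heapq.heappop(heap)
        if not m.children and not b.children:
            return m.label, b.label, dist  # first leaf-leaf pair popped is optimal
        # Expand whichever side is not a leaf; a leaf is paired with itself.
        for mc in (m.children or [m]):
            for bc in (b.children or [b]):
                heapq.heappush(heap, (mindist(mc, bc), next(tie), mc, bc))
    return None

def leaf_r(label, x, y):
    return RNode((x, y, x, y), label=label)

def leaf_b(label, x, y):
    return BNode((x, y), label=label)

# Toy data (hypothetical coordinates): four candidates and four locations.
m1 = RNode((0, 0, 0, 2), [leaf_r("a", 0, 0), leaf_r("b", 0, 2)])
m2 = RNode((10, 0, 10, 3), [leaf_r("c", 10, 0), leaf_r("d", 10, 3)])
m0 = RNode((0, 0, 10, 3), [m1, m2])
b1 = BNode((1, 10), 1.0, [leaf_b("q1", 0, 10), leaf_b("q2", 2, 10)])
b2 = BNode((10, 4), math.sqrt(2), [leaf_b("q3", 9, 5), leaf_b("q4", 11, 3)])
b0 = BNode((5.5, 7), 7.0, [b1, b2])
seed = closest_pair(m0, b0)
```

In this toy instance the pairs involving `m1` and `b1` are pushed onto the queue but never popped, which mirrors the claim that mutually distant MBRs and balls are never examined further.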

Figure 6 presents an illustrative example for SRDO. Assume
there are four candidates $\{a, b, c, d\}$ indexed by an
R-Tree and four activity locations $\{q_1, q_2, q_3, q_4\}$ indexed
by a BallTree. To find $v_{seed}$ and $q_{ref}$, we first insert the
root of R-Tree, $M_0$, into $U_R$, and insert the root of BallTree,
$B_0$, into $U_B$. There is only one element in each list,
and $MINDIST(M_0, B_0) = 0$ since they overlap. Thus,
SRDO extracts $M_0$ and $B_0$ and inserts their children into
$U_R$ and $U_B$, respectively. Now, $U_R = \{M_1, M_2\}$ and
$U_B = \{B_1, B_2\}$. SRDO then extracts $M_2$ and $B_2$ from each list since
$MINDIST(M_2, B_2)$ is the smallest one. Afterwards, we
insert the children of $M_2$ and $B_2$ into the lists, respectively,
and now $U_R = \{M_1, c, d\}$ and $U_B = \{B_1, q_3, q_4\}$. SRDO finds
that $d$ and $q_4$ incur the minimum spatial distance and assigns
$v_{seed}$ as $d$ and $q_{ref}$ as $q_4$.

Fig. 7: Example of APDO.

Once $v_{seed}$ and $q_{ref}$ are extracted, $q_{ref}$ in SRDO is fixed.
The candidate attendees chosen later still need to follow
Socio-Spatial Ordering to maintain the required social tightness
of $S_I$, and Familiarity Pruning is employed to prune the
intermediate solutions that will not become feasible groups.
Moreover, distance pruning strategies based on R-Tree and
BallTree are employed to remove activity locations (detailed
later) that will never produce a better solution.

All-Pair Distance Ordering (APDO). With SRDO, as $S_I$
grows, the $q_{ref}$ selected initially may not be the eventual
activity location with the minimum total distance to $S_I$. Figure
7 presents an illustrative example with $p = 3$ and $\{a, b, c, d\}$
as the candidates indexed by R-Tree, while $\{q_1, q_2, q_3, q_4\}$
are activity locations indexed by BallTree. SRDO finds $d$
for $v_{seed}$ and $q_4$ for $q_{ref}$. Thereafter, $a$ and $c$ are moved
into $S_I$. However, since $a$ and $c$ are distant from $q_4$, the
solution obtained by SRDO, i.e., $(\{a, c, d\}, q_4)$, incurs a large
total spatial distance. In contrast, a better feasible solution is
$(\{a, b, d\}, q_3)$, which greatly reduces the total spatial distance.
Therefore, we propose All-Pair Distance Ordering (APDO),
to select proper candidates from $S_R$ and adaptively switch
$q_{ref}$ to the most suitable activity location.

More specifically, APDO
simultaneously chooses $q_{ref}$ and a candidate attendee $v_c$
to expand $S_I$ at each iteration, such that the total spatial
distance from $S_I \cup \{v_c\}$ to the selected $q_{ref}$ is minimized, i.e.,

$$\min_{v_c \in S_R,\; q_{ref} \in Q_I} \sum_{s \in S_I \cup \{v_c\}} d_{s,q_{ref}}.$$

A straightforward approach to select $v_c$ and $q_{ref}$ is to scan
over the entire sets of $S_R$ and $Q_I$. However, this approach
requires $(|S_R| + |S_I|) \cdot |Q_I|$ distance computations when we
move a candidate into $S_I$. To reduce this overhead, we
traverse both R-Tree and BallTree simultaneously, to reduce
unnecessary distance computations.
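For reference, the straightforward scan can be written as a short Python sketch, assuming Euclidean points in the plane. The function name `apdo_step_bruteforce` is hypothetical. The sketch performs exactly $(|S_R| + |S_I|) \cdot |Q_I|$ distance computations, which is the cost the index-based traversal is meant to avoid.

```python
import math

def apdo_step_bruteforce(s_i, s_r, q_i):
    """Pick (v_c, q_ref) minimizing the total distance from S_I plus v_c to q_ref.

    Returns (score, v_c, q_ref), or None when S_R or Q_I is empty.
    """
    best = None
    for q in q_i:
        base = sum(math.dist(s, q) for s in s_i)  # |S_I| distances per location
        for v in s_r:                             # |S_R| distances per location
            score = base + math.dist(v, q)
            if best is None or score < best[0]:
                best = (score, v, q)
    return best
```

Because `base` depends only on the location, the reference location can change between iterations as `s_i` grows, which is the adaptive behavior that distinguishes APDO from SRDO.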

Two lists $U_R$ and $U_B$ are maintained during the traversal
of R-Tree and BallTree. At each stage, MBR $M_i$ and ball $B_j$
are extracted from $U_R$ and $U_B$ based on the following score
function,

$$\min_{M_i \in U_R,\; B_j \in U_B} \Big\{ \sum_{v \in S_I} MINDIST(v, B_j) + MINDIST(M_i, B_j) \Big\} \qquad (7)$$

where $MINDIST(v, B_j) = d_{v,ctr(B_j)} - r(B_j)$, and
$MINDIST(M_i, B_j)$ cannot exceed $t$. In Eq. (7), the first term
represents the minimum total spatial distance from $S_I$ to any
activity location within $B_j$, while the second term represents
the minimum spatial distance from a candidate attendee in
$M_i$ to an activity location in $B_j$.

After extracting $M_i$ and $B_j$ from Eq. (7), if $M_i$ is not a leaf
node on R-Tree, we pop it from $U_R$ and insert its children
into $U_R$. Similarly, if $B_j$ is a non-leaf node on BallTree, we
also pop it from $U_B$ and insert its children into $U_B$. As such,
APDO extracts $v_c$ and $q_{ref}$ without accessing the candidate
attendees and activity locations distant from each other. We
repeat the above procedure until $M_i$ and $B_j$ are both leaf
nodes and $M_i \in S_R$. Finally, we move $v_c$ from $S_R$ into $S_I$ and
continue the branch-and-bound search. Moreover, during
the above procedure, if $MINDIST(v_c, B_i) > t$ for a ball $B_i$,
all the activity locations within $B_i$ can be removed from $Q_I$
since no activity location in $B_i$ satisfies the spatial radius
constraint. APDO iteratively extracts $v_c$ and $q_{ref}$ which incur
the minimum spatial distance so as to avoid the situation
where $q_{ref}$ is only close to a small number of candidate
attendees but distant from the others. Moreover, APDO also
allows for the early pruning of activity locations that are
distant from the candidate attendees. This effectively reduces
computation overhead when performing distance pruning
strategies afterwards.

Figure 7 presents an example with four candidates
$\{a, b, c, d\}$ indexed by R-Tree and four activity locations
$\{q_1, q_2, q_3, q_4\}$ indexed by BallTree. Initially, when $S_I = \emptyset$,
APDO finds the first $v_c$ and the corresponding $q_{ref}$ as
follows (see the first column in the table). APDO first inserts
the root of R-Tree, $M_0$, into $U_R$, and inserts the root of
BallTree, $B_0$, into $U_B$. There is only one element in each
list, and $MINDIST(M_0, B_0) = 0$ since they overlap. Thus,
APDO extracts $M_0$ and $B_0$ and inserts their children into
$U_R$ and $U_B$, respectively. Now, $U_R = \{M_1, M_2\}$ and
$U_B = \{B_1, B_2\}$. APDO then extracts $M_2$ and $B_2$ from each list since
$MINDIST(M_2, B_2)$ is the smallest one. Afterwards, we
insert the children of $M_2$ and $B_2$ into the lists, respectively,
and now $U_R = \{M_1, c, d\}$ and $U_B = \{B_1, q_3, q_4\}$. APDO
finds that $d$ and $q_4$ incur the minimum spatial distance,
and the first candidate to be moved into $S_I$ is $d$ with the
corresponding $q_{ref}$ as $q_4$.

To choose the second candidate $v_c$ and update $q_{ref}$ (see the
second column in the table), we insert the roots $M_0$ and $B_0$
of the R-Tree and BallTree into $U_R$ and $U_B$, respectively. Then
$M_0$ and $B_0$ are extracted to insert their children, i.e.,
$U_R = \{M_1, M_2\}$ and $U_B = \{B_1, B_2\}$. Since $M_2$ and $B_2$ minimize
Eq. (7), their children are inserted, i.e., $U_R = \{M_1, c\}$ and
$U_B = \{B_1, q_3, q_4\}$. Note that here $d$ is not inserted into $U_R$
since it is not within $S_R$. Now, $M_1$ and $q_3$ minimize Eq. (7)
since $MINDIST(M_1, q_3) = 0$, i.e., $M_1$ and $q_3$ overlap.
Therefore, $M_1$ is popped from $U_R$ with its children inserted
back into $U_R$. Thus, $U_R = \{a, b, c\}$ and $U_B = \{B_1, q_3, q_4\}$.
Among them, $\sum_{v \in S_I} MINDIST(v, q_3) + MINDIST(a, q_3)$
is the minimum. In other words, $v_c = a$ and $q_{ref} = q_3$. It is
worth noting that at this stage, $q_{ref}$ changes from $q_4$ to $q_3$
since $q_3$ incurs a smaller total spatial distance to $S_I \cup \{a\}$.
Therefore, the second candidate to be moved into $S_I$ is $a$. The
third column in the table details the extraction of the next
$v_c$ and the corresponding $q_{ref}$, where $v_c = b$ and $q_{ref} = q_3$.
After $b$ is moved into $S_I$, $(\{a, b, d\}, q_3)$ is the
first feasible solution. In addition, APDO does not need to
examine the children of $B_1$ since they are far away from the
candidates.

Fig. 8: Outer-Triangle Distance Pruning.

5.5 Distance Pruning Strategies 

To avoid examining redundant activity locations, a simple
approach is to apply Distance Pruning (see Section 4.2) to
derive the lower bounds on the total spatial distance from
$S_I$ to each activity location. If the lower bound is larger than
the currently best solution value, the activity location can
be safely discarded from future expansions of $S_I$. However,
the above approach is computation intensive because the
total distance from each attendee in $S_I$ to each activity
location needs to be obtained. In the following, we introduce
a number of new pruning strategies designed to boost the
efficiency in trimming redundant search space when a new
attendee is added to $S_I$.

We first propose Outer-Triangle Distance Pruning (OTDP)
and Inner-Triangle Distance Pruning (ITDP) to derive the distance
lower bounds with triangular inequality, which incur
only small computation overhead. We then propose Activity
Location Distance Pruning (ALDP), which derives the lower
bounds with the help of R-Tree and BallTree to facilitate
pruning of activity locations in balls simultaneously. In the
following, we first discuss OTDP and ITDP for pruning
single locations (point versions). This is then extended to
pruning balls of locations (ball versions). Since points can
be viewed as degenerate balls, the point versions of OTDP
and ITDP can be treated as special cases of ball versions.

Outer-Triangle Distance Pruning. The strategy is to derive
a lower bound on the total spatial distance from $S_I$ to an
activity location $q_y$ according to the total spatial distance
from $S_I$ to another activity location derived before. Here,
Outer-Triangle indicates that the derivation of triangular
inequality is through activity locations, i.e., outside $S_I$. On
the other hand, Inner-Triangle Distance Pruning (which will
be detailed later) derives the distance lower bounds with
triangular inequality purely based on the attendees in $S_I$.

Consider an activity location $q_y$ under examination. Let $q_x$
be an examined location, $d_{s_i,q_x}$ denote the spatial distance
from an attendee $s_i \in S_I$ to $q_x$, and $d_{q_x,q_y}$ denote the spatial
distance from $q_x$ to $q_y$. As shown in Figure 8(a), the lower
bound on the spatial distance from $s_i$ to $q_y$ can be derived
as $d_{q_x,q_y} - d_{s_i,q_x} \leq d_{s_i,q_y}$ according to triangular inequality.
Therefore, a lower bound on the total spatial distance from
$S_I$ to $q_y$ could be computed as

$$\sum_{i=1}^{|S_I|} \left( d_{q_x,q_y} - d_{s_i,q_x} \right) = |S_I| \cdot d_{q_x,q_y} - \sum_{i=1}^{|S_I|} d_{s_i,q_x}.$$

On the other hand, to compose
a group with exactly $p$ attendees, MAGS needs to select the
remaining $p - |S_I|$ attendees from $S_R$ into $S_I$. A lower bound
on the total spatial distance of these $p - |S_I|$ attendees to $q_y$
is $(p - |S_I|) \cdot d_{v_{min},q_y}$, where $d_{v_{min},q_y}$ denotes the minimum
spatial distance from $q_y$ to any candidate in $S_R$. Therefore,
letting $D$ denote the currently best solution value, the following
lemma specifies Outer-Triangle Distance Pruning.

Lemma 1: If $|S_I| \cdot d_{q_x,q_y} - \sum_{i=1}^{|S_I|} d_{s_i,q_x} + (p - |S_I|) \cdot d_{v_{min},q_y} \geq D$,
$q_y$ never produces a better solution for any set of candidates
expanded from $S_I$.

Proof: In the above inequality, the first two terms represent
a lower bound on the total spatial distance from $S_I$
to $q_y$, and the third term is a lower bound on the total
spatial distance from $q_y$ to any $(p - |S_I|)$ candidates in
$S_R$. As shown in Figure 8(a), from triangular inequality,
if $d_{q_x,q_y} \geq \max_{s_i \in S_I} d_{s_i,q_x}$, then
$d_{q_x,q_y} - d_{s_i,q_x} \leq d_{s_i,q_y}$, $\forall\, 1 \leq i \leq |S_I|$, must hold. Therefore,

$$\sum_{i=1}^{|S_I|} \left( d_{q_x,q_y} - d_{s_i,q_x} \right) \leq \sum_{i=1}^{|S_I|} d_{s_i,q_y},$$

which can be written as
$|S_I| \cdot d_{q_x,q_y} - \sum_{i=1}^{|S_I|} d_{s_i,q_x} \leq \sum_{i=1}^{|S_I|} d_{s_i,q_y}$.
Note that $d_{q_x,q_y} \geq \max_{s_i \in S_I} d_{s_i,q_x}$
is necessary to be satisfied; otherwise the left-hand side
of $d_{q_x,q_y} - d_{s_i,q_x} \leq d_{s_i,q_y}$ is not guaranteed to be a
non-negative value, and $(d_{q_x,q_y} - d_{s_i,q_x})$ is not able to
act as a meaningful lower bound on $d_{s_i,q_y}$. On the other hand,
$(p - |S_I|) \cdot d_{v_{min},q_y}$ is a lower bound on the total spatial distance from
$(p - |S_I|)$ candidates in $S_R$ to activity location $q_y$. Therefore,
$|S_I| \cdot d_{q_x,q_y} - \sum_{i=1}^{|S_I|} d_{s_i,q_x} + (p - |S_I|) \cdot d_{v_{min},q_y}$ is a lower bound
on the total spatial distance from any set of $p$ candidates
expanded from $S_I$ to $q_y$. In summary, if the Outer-Triangle
Distance Pruning condition holds, the total spatial distance
from any set of $p$ candidates expanded from $S_I$ to $q_y$ always
exceeds or equals $D$. □

Since $\sum_{i=1}^{|S_I|} d_{s_i,q_x}$ is computed when we access $q_x$, we
only need to compute $d_{q_x,q_y}$ instead of each $d_{s_i,q_y}$. More
importantly, it is possible to improve Outer-Triangle Distance
Pruning from a single location to a ball of locations, as shown
in Figure 8(b), to prune multiple redundant activity locations
in the early stages of MAGS.
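The point version of Outer-Triangle Distance Pruning reduces to a few arithmetic operations once the sum of distances to the examined location is cached. The Python sketch below assumes that sum is available; the function and parameter names are hypothetical and chosen for illustration only.

```python
import math

def otdp_lower_bound(sum_to_qx, group_size, d_qx_qy, p, d_vmin_qy):
    """Lemma 1 left-hand side.

    sum_to_qx  : cached sum of d(s_i, q_x) over the current group S_I
    group_size : |S_I|
    d_qx_qy    : distance between the examined location q_x and candidate q_y
    p          : required group size
    d_vmin_qy  : minimum distance from q_y to any candidate remaining in S_R
    """
    return group_size * d_qx_qy - sum_to_qx + (p - group_size) * d_vmin_qy

def otdp_prunes(sum_to_qx, group_size, d_qx_qy, p, d_vmin_qy, best_d):
    """True when q_y cannot beat the incumbent value D (here best_d)."""
    return otdp_lower_bound(sum_to_qx, group_size, d_qx_qy, p, d_vmin_qy) >= best_d
```

As a quick check with collinear points: for a group at `(0, 0)` and `(1, 0)`, `q_x = (0.5, 0)`, `q_y = (10, 0)`, `p = 3`, and remaining candidates at `(3, 0)` and `(20, 0)`, the bound evaluates to 25, while the best achievable total distance to `q_y` is 26.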

Specifically, when we consider two balls $B_x$ and $B_y$
instead of two locations $q_x$ and $q_y$, a lower bound on
the spatial distance from $s_i \in S_I$ to any location in $B_y$
can be computed as $d_{ctr(B_x),ctr(B_y)} - d_{s_i,ctr(B_x)} - r(B_y)$, as
shown in Figure 8(b). Therefore, a lower bound on the
total spatial distance from $S_I$ to any location in $B_y$ is
$|S_I| \cdot d_{ctr(B_x),ctr(B_y)} - \sum_{s_i \in S_I} d_{s_i,ctr(B_x)} - |S_I| \cdot r(B_y)$. Moreover,
MAGS also derives a lower bound of the remaining $p - |S_I|$
attendees as $(p - |S_I|) \cdot \min_{M_i \in U_R} MINDIST(M_i, B_y)$, where
$\min_{M_i \in U_R} MINDIST(M_i, B_y)$ denotes a lower bound on
the spatial distance from the locations in $B_y$ to its closest
candidate attendees (i.e., $M_i$). In summary, given the currently
best solution value $D$, ball $B_x$ and $S_I$, a ball $B_y$ can
be pruned according to the following lemma.

Lemma 2. If \Si\ • dotr{Bx).ctr{By) dsi,ctr{Bx) \^r\ 

r{By) + {p - \Si\)mmMiGURM:iNDIST{Mi,By) > D, the 
activity locations within ball By never produce a better 
solution for any set of p candidates expanded from S'/. 

Proof: As shown in Figure |8(b)l if maxs-gs^ dg.^ctr{Bx) < 
dctr{Bx),ctr(By), accordiug to triangular inequality, 

dctr{Bx),ctr{By) dg-^ctr^Bx) S dg-ctr^By) mUSt hold. 

Therefore, ^^si^Si^dctr(Bx).ctr{By) dg^otr{Bx)) 

\Si\ ■ dctr{Bx),ctr(By) E. dgi,ctr{Bx) is ^ lower bound 
on J2gieSi dsi.ctriBy), i-e., the total distance from S/ 
to ctr{By). In addition, since dg.^ctr^By) — f{By) is a 
lower bound on the distance from Si to any activity 
location in By, the lower bound on the total spatial 
distance from any activity location within By to S/ is 
thus |S/| • dotr[Bx).ctr{By) dg^^ctr{Bx) ' ^{ddy). 

Moreover, the lower bound on the total spatial distance 
from any activity location within By to (p — |S/|) candidates 
in Sr is (p— \Si\)mmMieURMINDIST{Mi,By). Therefore, 
if the condition of Outer-Triangle Distance Pruning holds. 
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(I'S'/I - 1) E1=i ds,,q^ > E1=i ^ El=i +1 ds,,s^, and we have 
E 1 =i' 4 ,. 9 . > ]s 7 ^El=i'”^Ei=i+i 4 ,,s,. In other words, 
S£i+i ds,.s, acts as a lower hound on the 


^Qx ^ /^i—l ' 

1 Y^|S/| I 

^j=i+l ^Si, 

|Sj|-l ^j=l ^j=i+l “st.sj 

total spatial distance from to Si. On the other hand, 
{p — is a lower bound on the total spatial 

to qx from any {p — |S'/|) candidates in Sr. Therefore, 

^ tr E£1 i ds.,s,) +{p- \Si\)dx 


is a lower 


Fig. 9: Inner-Triangle Distance Pruning. 

any activity location within By never produces a better 
solution by incorporating any {p — |S'/|) candidates into Sj, 
and By thus can be safely pruned. □ 

Compared with Lemma[TJ Lemma |2] further aggregates the 
distance computation by including multiple balls in BallTree. 
Outer-Triangle Distance Pruning is performed each time 
Si is expanded. Note that J2sies, ‘^si,ctr{B^) is computed 
when accessing ball Bx. Therefore, Outer-Triangle Distance 
Pruning is able to prune multiple balls without recomputing 
^^siGSj dsi,ctr{Bx) oach time. 

Inner-Triangle Distance Pruning (ITDP). Outer-Triangle 
Distance Pruning derives the distance lower bounds based 
on the distance from Si to another previously-calculated 
activity location. On the other hand, the idea of Inner- 
Triangle Distance Pruning is that, when the attendees in Si 
are sparser, the total spatial distance from Si to some ac¬ 
tivity locations may also increase. Therefore, Inner-Triangle 
Distance Pruning removes redundant activity locations by 
deriving the lower bounds of the total spatial distance from 
attendees in Si to activity locations, based on the spatial 
distances of attendees in Si. 

9(a)| shows a case where Si contains three atten- 


bound on the total spatial distance from any set of p candi¬ 
dates expanded from Si to the activity location qx. Therefore, 
Figure 9(a) shows an example with three attendees. In this case,
the distance between each pair of attendees in $S_I$, i.e.,
$d_{s_i,s_j}$ (solid lines), is used to derive a lower bound on the
total spatial distance from $s_i$ and $s_j$ to $q_x$ (dotted lines),
i.e., $d_{s_i,q_x} + d_{s_j,q_x} \geq d_{s_i,s_j}$. Therefore,
Figure 9(a) shows a set of lower bounds on the spatial distance
from $S_I$ to any location $q_x$:
1) $d_{s_1,q_x} + d_{s_2,q_x} \geq d_{s_1,s_2}$,
2) $d_{s_1,q_x} + d_{s_3,q_x} \geq d_{s_1,s_3}$, and
3) $d_{s_2,q_x} + d_{s_3,q_x} \geq d_{s_2,s_3}$.
Summing them up, we have a lower bound on the total spatial
distance from $S_I$ to $q_x$, i.e.,

$$\sum_{i=1}^{|S_I|} d_{s_i,q_x} \geq \frac{1}{|S_I|-1} \sum_{i=1}^{|S_I|-1} \sum_{j=i+1}^{|S_I|} d_{s_i,s_j}.$$

On the other hand, since MAGS needs to move another $p - |S_I|$
attendees into $S_I$, a lower bound on the total spatial distance
from them to $q_x$ is $(p - |S_I|) \cdot d_{v_{min},q_x}$, where
$d_{v_{min},q_x}$ denotes the minimum spatial distance from $q_x$
to any candidate in $S_R$. Therefore, Inner-Triangle Distance
Pruning is specified in the following lemma.

Lemma 3: If

$$\frac{1}{|S_I|-1} \sum_{i=1}^{|S_I|-1} \sum_{j=i+1}^{|S_I|} d_{s_i,s_j} + (p - |S_I|) \cdot d_{v_{min},q_x} \geq D$$

holds, $q_x$ never produces a better solution for any set of $p$
candidates expanded from $S_I$.

Proof: The first term of the above inequality is a lower bound
on the total spatial distance from $S_I$ to $q_x$, and the second
term is a lower bound on the total spatial distance to $q_x$ from
any $(p - |S_I|)$ candidates in $S_R$. From the triangle inequality,
we have $d_{s_i,q_x} + d_{s_j,q_x} \geq d_{s_i,s_j}$,
$\forall 1 \leq i,j \leq |S_I|$ and $i \neq j$, such as
$d_{s_1,q_x} + d_{s_2,q_x} \geq d_{s_1,s_2}$ and
$d_{s_1,q_x} + d_{s_3,q_x} \geq d_{s_1,s_3}$, as shown in
Figure 9(a). Therefore,

$$(|S_I|-1) \sum_{i=1}^{|S_I|} d_{s_i,q_x} + (|S_I|-1) \sum_{j=1}^{|S_I|} d_{s_j,q_x} \geq 2 \sum_{i=1}^{|S_I|-1} \sum_{j=i+1}^{|S_I|} d_{s_i,s_j},$$

where $(|S_I|-1) \sum_{i=1}^{|S_I|} d_{s_i,q_x} = (|S_I|-1) \sum_{j=1}^{|S_I|} d_{s_j,q_x}$.
Consequently, if the condition of Inner-Triangle Distance Pruning
holds, the total spatial distance from any set of $p$ candidates
expanded from $S_I$ to $q_x$ must equal or exceed $D$. □

Note that $|S_I| - 1$ must be included in the denominator to
prevent overestimation caused by counting each distance
$d_{s_i,q_x}$ multiple times. In addition, the first term can be
constructed incrementally as $S_I$ expands, which does not require
recomputation at each iteration. Therefore, Inner-Triangle
Distance Pruning can be performed efficiently.

It is more efficient to trim off multiple unnecessary activity
locations all together. Since
$\frac{1}{|S_I|-1} \sum_{i=1}^{|S_I|-1} \sum_{j=i+1}^{|S_I|} d_{s_i,s_j}$
is a lower bound on the total spatial distance from $S_I$ to a
point (the center of a ball $B_x$ of locations), we can subtract
$|S_I| \cdot r(B_x)$ from this term to obtain a lower bound on the
total spatial distance from $S_I$ to any location in $B_x$, as
shown in Figure 9(b). Moreover, similar to OTDP, we can replace
$(p - |S_I|) \cdot d_{v_{min},q_x}$ in Lemma 3 by its lower bound
$(p - |S_I|) \cdot \min_{M_i \in U_R} MINDIST(M_i, B_x)$. Therefore,
given a ball $B_x$, all activity locations within $B_x$ can be
safely pruned according to the following lemma.

Lemma 4: If

$$\frac{1}{|S_I|-1} \sum_{i=1}^{|S_I|-1} \sum_{j=i+1}^{|S_I|} d_{s_i,s_j} - |S_I| \cdot r(B_x) + (p - |S_I|) \cdot \min_{M_i \in U_R} MINDIST(M_i, B_x) \geq D$$

holds, any activity location within ball $B_x$ never produces a
better solution expanded from $S_I$.

Proof: As illustrated in Figure 9(b) and pointed out in Lemma 3,
$\frac{1}{|S_I|-1} \sum_{i=1}^{|S_I|-1} \sum_{j=i+1}^{|S_I|} d_{s_i,s_j}$
is a lower bound on the total spatial distance from $S_I$ to
$ctr(B_x)$. The term $|S_I| \cdot r(B_x)$ must be subtracted to
ensure that
$\frac{1}{|S_I|-1} \sum_{i=1}^{|S_I|-1} \sum_{j=i+1}^{|S_I|} d_{s_i,s_j} - |S_I| \cdot r(B_x)$
is a lower bound on the total spatial distance from $S_I$ to any
activity location within $B_x$. On the other hand,
$(p - |S_I|) \cdot \min_{M_i \in U_R} MINDIST(M_i, B_x)$ represents
a lower bound on the total spatial distance from $(p - |S_I|)$
candidates in $S_R$ to $B_x$. Therefore, when the above condition
holds, any activity location within $B_x$ never produces a better
solution with any group of $p$ candidates expanded from $S_I$.
Thus, $B_x$ can be safely pruned. □
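The two tests in Lemma 3 and Lemma 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: points are 2-D tuples with Euclidean distance, and the function names (`inner_triangle_bound`, `prune_location`, `prune_ball`) as well as the arguments `d_vmin_qx` and `min_mindist` (the precomputed second-term quantities) are hypothetical and not part of MAGS as published.

```python
import math
from itertools import combinations


def dist(a, b):
    """Euclidean distance between two 2-D points (an assumption of this sketch)."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def inner_triangle_bound(selected):
    """First term of Lemma 3: pairwise distances in S_I divided by |S_I| - 1."""
    if len(selected) < 2:
        return 0.0  # no pair is available, so only the trivial bound holds
    pair_sum = sum(dist(a, b) for a, b in combinations(selected, 2))
    return pair_sum / (len(selected) - 1)


def prune_location(selected, p, d_vmin_qx, best_d):
    """Lemma 3 test: True if a single location q_x cannot improve on best_d."""
    remaining = p - len(selected)
    return inner_triangle_bound(selected) + remaining * d_vmin_qx >= best_d


def prune_ball(selected, p, radius, min_mindist, best_d):
    """Lemma 4 test: True if no location inside the ball can improve on best_d."""
    remaining = p - len(selected)
    bound = inner_triangle_bound(selected) - len(selected) * radius
    return bound + remaining * min_mindist >= best_d
```

For instance, with `selected = [(0, 0), (4, 0), (0, 3)]` the pairwise distances are 4, 3, and 5, so the first term is 12 / 2 = 6. Because the pairwise sum only changes when a new attendee joins $S_I$, a real implementation would cache it instead of recomputing it as this sketch does.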

Activity Location Distance Pruning. Activity Location
Distance Pruning exploits the MBRs in R-Tree and the balls
in BallTree to quickly filter out unqualified activity locations.
Given the current solution value $D$ and a ball $B_x$, Activity
Location Distance Pruning jointly considers a lower bound from
the attendees in $S_I$ to $B_x$ and a lower bound from the
$(p - |S_I|)$ remaining candidates in $S_R$ to $B_x$. If the sum
of the two lower bounds reaches or exceeds $D$, it concludes that
$S_I$ and all activity locations within $B_x$ never produce a
better solution with a total spatial distance smaller than $D$.
Specifically, Activity Location Distance Pruning is based on
Lemma 5 below.

Lemma 5: If

$$\sum_{s_i \in S_I} MINDIST(s_i, B_x) + (p - |S_I|) \cdot \min_{M_i \in U_R} MINDIST(M_i, B_x) \geq D$$

holds, the activity locations within ball $B_x$ do not produce a
better solution than the current solution corresponding to $D$.

Proof: As shown in Figure 10, $MINDIST(s_i, B_x) = d_{s_i,B_x} - r(B_x)$
is a lower bound on the distance from $s_i$ to any activity
location within $B_x$. Thus, $\sum_{s_i \in S_I} MINDIST(s_i, B_x)$
represents a lower bound on the total spatial distance from $S_I$
to any activity location within ball $B_x$, and
$(p - |S_I|) \cdot \min_{M_i \in U_R} MINDIST(M_i, B_x)$ represents
a lower bound on the total spatial distance from the $(p - |S_I|)$
remaining candidates to any activity location in $B_x$. Therefore,
when the above condition holds, any activity location within $B_x$
never produces a better solution when any $(p - |S_I|)$ candidates
are selected into $S_I$. Therefore, $B_x$ can be safely pruned. □

Fig. 10: Activity Location Distance Pruning.
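The condition of Lemma 5 can be checked as in the following minimal Python sketch. It assumes 2-D Euclidean points; the helper names and the argument `min_mindist_remaining` (standing for the precomputed minimum MINDIST over the MBRs of the remaining candidates) are hypothetical. Clamping MINDIST at zero for attendees that lie inside the ball is an assumption added here so that the bound never becomes negative.

```python
import math


def mindist_point_ball(point, center, radius):
    """Distance from a point to the ball centre minus the radius, floored at 0."""
    d = math.hypot(point[0] - center[0], point[1] - center[1])
    return max(0.0, d - radius)


def aldp_prunes(selected, p, center, radius, min_mindist_remaining, best_d):
    """Lemma 5 test: True if no location inside the ball can improve on best_d."""
    # Lower bound contributed by the attendees already in S_I
    # (one distance computation per attendee and per ball).
    lower_selected = sum(mindist_point_ball(s, center, radius) for s in selected)
    # Lower bound contributed by the p - |S_I| attendees still to be chosen.
    lower_remaining = (p - len(selected)) * min_mindist_remaining
    return lower_selected + lower_remaining >= best_d
```

The loop over `selected` makes the per-ball cost of $|S_I|$ distance computations visible, which is the overhead the triangle-based strategies avoid.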

Although the above strategy is simple, it still incurs high
computation overhead because $|S_I|$ distance computations are
performed for each ball $B_x$ to find the lower bound. In other
words, the above strategy incurs $|S_I| \cdot n$ distance
computations, where $n$ is the number of balls. In the following,
therefore, we propose two strategies which utilize the information
of activity locations and the relationship of attendees in $S_I$
to reduce the number of distance computations.

Here we briefly analyze the above distance pruning strategies.
Let $m$ denote the number of distance computations for
$(p - |S_I|) \cdot \min_{M_i \in U_R} MINDIST(M_i, B_x)$. Activity
Location Distance Pruning incurs the highest computation overhead,
i.e., $(n \cdot |S_I| + m)$, as distance computations are required
for $n$ balls (for each ball, it derives $MINDIST(s_i, B_x)$,
$\forall s_i \in S_I$). On the other hand, Outer-Triangle Distance
Pruning incurs $(|S_I| + n + m - 1)$ distance computations for $n$
balls in the worst case, including $|S_I|$ computations for the
total spatial distance from $S_I$ to $ctr(B_x)$, and $(n - 1)$
computations for the distances from $ctr(B_x)$ to the centers of
the other $(n - 1)$ balls. Similarly, when deriving the lower
bound on $S_I$ and $B_x$, Inner-Triangle Distance Pruning only
considers the distances between each pair of attendees in $S_I$,
which can be computed incrementally and cached in early stages.
Therefore, each time a new attendee is added to $S_I$,
Inner-Triangle Distance Pruning performs $(|S_I| - 1 + m)$
distance computations for $n$ balls. Therefore, Outer-Triangle
Distance Pruning and Inner-Triangle Distance Pruning are much more
efficient than Activity Location Distance Pruning.

5.6 Discussions 

User interests and existence of sponsors. We propose a
generalized model to support scenarios that involve user
interests. Let $\eta_{v,q}$ denote the interest measure (i.e., how
much an individual $v$ prefers a candidate location $q$) of a
person $v$ in an activity to be held at location $q$. A small
interest measure $\eta_{v,q}$ implies that $v$ highly prefers the
activities associated with $q$. Similar to the spatial radius
constraint in SSGQ and MRGQ, a new interest constraint
$\eta_{v,q} \leq h$ is added to the two problems, where $h$
denotes the interest threshold of an activity. For a candidate
member $v$ that prefers only karaoke studios and bars, the
interest measure from $v$ to coffee shops will be set to a large
value exceeding the threshold. Thus, $v$ in this case will never
be selected for an activity in $q$.

MRGQ and SSGQ can also flexibly handle the case when
sponsors of the activity exist. Here, we describe a generalized
graph model for the scenarios with sponsors. The sponsors are
represented by a set $S$ of new nodes in SSGQ and MRGQ. Here each
sponsor $s$ in $S$ is connected to a person $v$ if $s$ is
correlated to $v$, e.g., $v$ is an employee, a former student, or
a regular customer of $s$. This link information can be acquired
from address directories, personal Facebook profiles, or customer
databases. To support SSGQ and MRGQ with sponsors, the set $S$ is
added to the solution at the beginning of SSGS and MAGS. As such,
these two algorithms will automatically find a solution group
with correlation to $S$ (i.e., the attendees that $S$ would like
to sponsor). Moreover, if the activity locations are provided by
a sponsor, such as a chain restaurant, all branches of the chain
restaurant group can be included in the candidate location set
$Q$. Note that the group size $p$ needs to be increased by $|S|$
in the scenarios with sponsors, and the representatives of each
sponsor can also be initially added to the solution group.

Dynamically changing user locations. To index user locations,
one approach is to employ the fundamental R-Tree. If user
locations change frequently, however, this approach is likely to
incur frequent index updates or otherwise record outdated and
inaccurate information. A more promising approach is to exploit
R-Tree extensions that are specifically designed for dynamic
environments, such as Time-Parameterized R-Tree (TPR-Tree) [22]
or an improved version of TPR-Tree, TPR*-Tree [23]. Similar to
conventional R-Trees, TPR-Tree adopts Minimum Bounding Rectangles
(MBR) to hierarchically index the spatial objects (i.e., the
locations of the users). However, instead of recording objects'
locations at individual timestamps, TPR-Tree incorporates the
velocity of each object to predict their upcoming positions, and
thus updates are only triggered when the velocity changes. This
strategy significantly reduces the number of updates. In other
words, the MBR of an object or a tree node is a function of time.

More specifically, each dynamically changing location of
a user in the TPR-Tree is represented as 1) an MBR that
denotes its extent at some reference time (a system parameter),
and 2) its current velocity vector. The velocity vector
of an MBR is represented by the largest velocity of an object
within the MBR in each direction. This strategy ensures
that the MBR always encloses the underlying objects in the
future. With the above information, the future MBRs and the
objects' locations are not stored explicitly, but are quickly
computed based on the locations at the reference time and
the velocity vectors. Figure 11 presents an example of the
MBR and velocity vectors of three objects, i.e., a, b, and c.
The velocity vector of each object is shown as a solid arrow,
and the velocity vector of the MBR is represented by dashed
arrows. Figure 11(a) presents the locations of the objects
and the enclosing MBR at reference time 0. In the next time
slot, shown in Figure 11(b), the locations of the objects have
changed, and the enclosing MBR is enlarged to enclose all
objects. As shown in this example, TPR-Tree only stores
the locations of the objects and the corresponding MBR at
the reference time, and the locations of the objects with the
enclosing MBR in the future can then be computed with the
velocity vectors.
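The time-parameterized MBR described above can be illustrated with a small Python sketch. It is not the TPR-Tree implementation of [22]; the class `MovingObject` and the function `mbr_at` are hypothetical names, objects are treated as points with constant velocity, and the reference time is assumed to be 0.

```python
from dataclasses import dataclass


@dataclass
class MovingObject:
    x: float   # position at reference time 0
    y: float
    vx: float  # constant velocity components
    vy: float

    def position_at(self, t):
        """Predicted position t time units after the reference time."""
        return (self.x + self.vx * t, self.y + self.vy * t)


def mbr_at(objects, t):
    """Conservative MBR (x_lo, y_lo, x_hi, y_hi) at time t >= 0.

    Each edge starts at its reference-time extent and moves with the
    most extreme velocity of any enclosed object in that direction,
    so the rectangle never stops enclosing the objects.
    """
    x_lo = min(o.x for o in objects) + min(o.vx for o in objects) * t
    y_lo = min(o.y for o in objects) + min(o.vy for o in objects) * t
    x_hi = max(o.x for o in objects) + max(o.vx for o in objects) * t
    y_hi = max(o.y for o in objects) + max(o.vy for o in objects) * t
    return (x_lo, y_lo, x_hi, y_hi)
```

Only the reference-time extents and the four edge velocities need to be stored; `mbr_at` derives every later rectangle from them, which is why an index update is needed only when a velocity changes.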

Fig. 11: An example of the MBR and objects with velocity
vectors.

To support dynamic environments in this paper, we replace
R-Tree with TPR-Tree. Similar to the TPR-Tree problem,
a parameter $t_f > 0$ is included in SSGQ and MRGQ to ensure
that the optimal group is returned for dynamic environments
from now to $t_f$ time units afterward. The ordering
and pruning strategies of SSGS and MAGS are performed
in the same way, except that the extents of the MBRs and
the user locations have to be calculated before SSGS and
MAGS. On the other hand, the BallTree in this paper only
indexes the activity locations (e.g., restaurants, theaters, etc.)
and thus does not need to be updated frequently.

6 Tractability of MRGQ in Threshold Graph

In Graph Theory, analyzing the tractability of NP-Hard
graph problems in special graph classes is very important
for theoreticians, e.g., [16], [17]. Therefore, in addition to the
inapproximability of MRGQ analyzed above, we also prove
that MAGS can find the optimal solution in polynomial time
in a special graph class, namely threshold graph, which is
defined as follows [15].

Definition 1: Given a threshold $\tau$ and a graph $G = (V, E)$
with a non-negative weight $w_v$ for each vertex $v \in V$, $G$ is
called a threshold graph if $w(U) = \sum_{v \in U} w_v \leq \tau$
holds for every subset $U \subseteq V$ with every two nodes in $U$
sharing no edge between them.

Let $deg_G(v)$ denote the degree of $v$. A degree partition
$D(V)$ of $V$ divides $V$ into $m + 1$ non-overlapping subsets
$D_0, D_1, \ldots, D_m$, i.e., $V = D_0 \cup D_1 \cup \cdots \cup D_m$.
Each subset $D_i$ includes all the vertices with degree
$\delta_i$, and $\delta_i > \delta_j$ if $i > j$. When $\tau = m$
and $w_v = deg_G(v)$, we exploit the following theorem in the
literature [15] to prove that MAGS can obtain the optimal
solution in polynomial time in a threshold graph.

Theorem 4: [15] Let $G = (V, E)$ be a threshold graph with
degree partition $D(V)$. For every pair of vertices $u \in D_i$ and
$v \in D_j$, $u$ is connected to $v$ in $G$ if and only if
$i + j > m$. Therefore, every vertex in $D_i$ shares the same
neighbors.
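The adjacency rule of Theorem 4 can be checked on small examples with the Python sketch below. It relies on two assumptions that are not stated in the text above: threshold graphs are generated by the standard construction that repeatedly adds either an isolated vertex ('i') or a vertex adjacent to all earlier ones ('d'), and the class $D_0$ is reserved for vertices of degree 0 (so it may be empty). All function names are hypothetical.

```python
def threshold_graph(creation):
    """Adjacency sets built from a creation sequence of 'i' and 'd' steps."""
    adj = {}
    for v, kind in enumerate(creation):
        adj[v] = set()
        if kind == "d":  # dominating vertex: connect to every earlier vertex
            for u in range(v):
                adj[v].add(u)
                adj[u].add(v)
    return adj


def degree_partition_index(adj):
    """Map each vertex to the index i of its class D_i, and return m."""
    positive = sorted({len(nbrs) for nbrs in adj.values() if nbrs})
    rank = {deg: i + 1 for i, deg in enumerate(positive)}
    rank[0] = 0  # assumed convention: D_0 holds the degree-0 vertices
    index = {v: rank[len(nbrs)] for v, nbrs in adj.items()}
    return index, len(positive)


def satisfies_theorem4(adj):
    """True if u ~ v holds exactly when index(u) + index(v) > m for all pairs."""
    index, m = degree_partition_index(adj)
    vertices = list(adj)
    return all(
        (v in adj[u]) == (index[u] + index[v] > m)
        for pos, u in enumerate(vertices)
        for v in vertices[pos + 1:]
    )
```

For the creation sequence `"idid"` the degrees are 2, 2, 1, 3, so $m = 3$ and the rule reproduces every edge; a path on four vertices, which is not a threshold graph, fails the check.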

We first describe the tie-breaking strategy for Socio-Spatial
Ordering in MAGS. Specifically, for any set $\hat{V} \subseteq S_R$
of vertices with the same spatial distance to each query point
$q_i$, the tie-breaking strategy checks if the vertices in
$\hat{V}$ satisfy Eq. (1) in descending order of their vertex
degrees. Therefore, the vertex $v \in \hat{V}$ with the maximum
$deg_G(v)$ is first examined by Eq. (1) and moved into $S_I$ if
$v$ satisfies Eq. (1). Moreover, we employ a pre-processing
strategy to remove unqualified vertices from the input graph $G$,
which works both when $G$ is a general graph and when $G$ is a
threshold graph. Given the social network $G$ with parameters $k$
and $p$, the pre-processing strategy iteratively removes from $G$
every vertex $v$ with fewer than $p - k - 1$ neighbors, along
with $v$'s incident edges. The iteration repeats until no more
vertices can be removed and produces a graph $\hat{G}$. Any
removed vertex $v$ cannot form a feasible group with other
vertices in $G$, because even when all the neighbors of $v$
(i.e., $N_v$) are included in the same group $F$, there is still
at least one vertex in $N_v \cup \{v\}$ with fewer than
$p - k - 1$ neighbors in $F$. The above process is called core
decomposition, and the remaining graph $\hat{G}$ is a maximal
$(p - k - 1)$-core [18], where $\hat{G} \subseteq G$ is the
largest graph such that each vertex has at least $p - k - 1$
neighbors in $\hat{G}$. The pre-processing strategy can be done
in $O(|E|)$ time [19]. Intuitively, if $|\hat{G}| < p$, then
there exists no solution to MRGQ.
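The pre-processing step can be sketched as a standard queue-based peeling procedure. This is a minimal illustration rather than the linear-time algorithm of [19] itself; the function name `prune_to_core` and the dictionary-of-sets graph representation are assumptions of this sketch. For MRGQ, `min_degree` would be set to $p - k - 1$.

```python
from collections import deque


def prune_to_core(adj, min_degree):
    """Repeatedly remove vertices with fewer than min_degree neighbours.

    adj maps each vertex to the set of its neighbours. The returned
    dictionary describes the remaining core (possibly empty).
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    removed = set()
    queue = deque(v for v, d in degree.items() if d < min_degree)
    while queue:
        v = queue.popleft()
        if v in removed:
            continue  # a vertex may have been queued more than once
        removed.add(v)
        for u in adj[v]:
            if u not in removed:
                degree[u] -= 1  # u just lost the neighbour v
                if degree[u] < min_degree:
                    queue.append(u)
    return {
        v: {u for u in nbrs if u not in removed}
        for v, nbrs in adj.items()
        if v not in removed
    }
```

If the returned graph has fewer than $p$ vertices, the query can be rejected immediately without entering the branch-and-bound search.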

Note that $D_j$ is the degree partition on $G$, instead of
$\hat{G}$ after the core decomposition. Let $V(\hat{G})$ denote
the set of vertices in $\hat{G}$, and let $\hat{j}$ denote the
minimum $j$ such that $D_j \cap V(\hat{G}) \neq \emptyset$. In
other words, $D_0, D_1, \ldots, D_{\hat{j}-1}$ are all removed
during core decomposition. The following theorem first compares
$G$ and $\hat{G}$.

Theorem 5: For input threshold graph $G$ and the graph $\hat{G}$
after core decomposition, $\bigcup_{i=\hat{j}}^{m} D_i = V(\hat{G})$
holds, and the neighbors of every vertex in $D_i$ are also the
neighbors of every vertex in $D_j$ if $i < j$.

Proof: Apparently, $\bigcup_{i=\hat{j}}^{m} D_i \supseteq V(\hat{G})$
since $\hat{G} \subseteq G$, and thus we prove that
$\bigcup_{i=\hat{j}}^{m} D_i \subseteq V(\hat{G})$ by
contradiction. Assume $u \in D_r$ for some $r > \hat{j}$, but
$u \notin V(\hat{G})$. Given a vertex $x$ in
$D_{\hat{j}} \cap V(\hat{G})$ and any vertex $y \in V(\hat{G})$
sharing an edge with $x$, if $y \in D_s$, $\hat{j} \leq s \leq m$,
according to Theorem 4, $\hat{j} + s > m$ must hold; otherwise,
$x$ and $y$ would not share an edge. Since $\hat{j}$ is the
minimum number such that $D_{\hat{j}} \cap V(\hat{G}) \neq \emptyset$,
$r + s > m$ must hold, implying that $u$ has an edge with $y$
because $r > \hat{j}$. From the definition of threshold graph,
$u$ also shares edges with all neighbors of $x$. Therefore, the
number of $u$'s neighbors in $V(\hat{G})$ is no smaller than that
of $x$, and $u$ should not be removed by the pre-processing
strategy because $x \in D_{\hat{j}} \cap V(\hat{G})$. Therefore,
this contradiction proves that
$\bigcup_{i=\hat{j}}^{m} D_i = V(\hat{G})$ holds. Moreover, since
vertex $u$ in $D_i$ is connected to $w$ in $D_n$ if and only if
$i + n > m$ according to Theorem 4, vertex $v$ in $D_j$ is also
connected to $w$ in $D_n$ if $j > i$ because $j + n > i + n > m$.
Therefore, the neighbors of every vertex in $D_i$ are also the
neighbors of every vertex in $D_j$ if $i < j$. □

In other words, core decomposition only trims the vertices
with smaller degrees, and if any vertex in $D_i$ is removed,
the above theorem shows that all the other vertices in
$D_i$ will be pruned as well. According to Theorem 4, every
vertex in $D_i$ still shares the same neighbors. In the following,
we provide the theoretical result for MAGS in Theorem 6.
In Graph Theory, since each vertex $v$ of a threshold graph
is not associated with a spatial distance to $q_i \in Q$, we first
assume that $d_{v,q_i} = 1$ for $\forall v \in V, q_i \in Q$.

Theorem 6: Let $G = (V, E)$ be a threshold graph with
degree partition $V = D_0 \cup D_1 \cup \cdots \cup D_m$ and
$\beta = \max\{i : |D_i| + \cdots + |D_m| \geq p\}$. Given an
$MRGQ(p, Q, k, t)$ for $G$, MAGS stops (either returning the
optimal solution or returning no solution) in polynomial time if
$d_{v,q_i} = 1, \forall v \in V, q_i \in Q$.

Proof: We employ All-Pair Distance Ordering for MAGS
here. If, after core decomposition, the resulting graph has
fewer than $p$ vertices, i.e., $|\hat{G}| < p$, there is no
feasible solution, and MAGS stops in $O(|E|)$ time (i.e., only
the core decomposition is performed). Otherwise, if
$|\hat{G}| \geq p$, we prove that there exists at least one
feasible solution, and we prove the theorem by examining two
different cases for $\beta$: $\beta > \lfloor m/2 \rfloor$ and
$\beta \leq \lfloor m/2 \rfloor$.

6. The results are theoretically interesting since no social
network is a threshold graph. On the other hand, the hardness
result for a general social network is presented in Theorem 1.

(1) $\beta > \lfloor m/2 \rfloor$. We prove that MAGS can find
the optimal solution by generating only $p$ nodes in the
branch-and-bound tree. From Theorem 4 and Theorem 5, every two
vertices $u$ and $v$ in $\bigcup_{i=\lfloor m/2 \rfloor + 1}^{m} D_i$
are connected by an edge in $\hat{G}$. Since
$|D_\beta| + \cdots + |D_m| \geq p$, the first $p$ vertices
selected and examined by Socio-Spatial Ordering and the
tie-breaking strategy must belong to $\bigcup_{i=\beta}^{m} D_i$
and form a path of length $p$ from the root in the
branch-and-bound tree. Let $F$ denote the group of these $p$
vertices. Since every two vertices in $\bigcup_{i=\beta}^{m} D_i$
are connected according to Theorem 4, $F$ is a feasible group. In
the $p$-th node generated in the branch-and-bound tree,
$(F, q_{ref})$ is a feasible solution in $\hat{G}$, where
$q_{ref}$ can be derived by All-Pair Distance Ordering. Because
$d_{v,q_i} = 1$ here, $\forall v \in V, q_i \in Q$, $(F, q_{ref})$
is the optimal solution, i.e., $q^* = q_{ref}$. Afterwards, the
distance pruning strategies of MAGS are performed exactly $p$
times (one time for each branch-and-bound node except the leaf
node, and one time for the root of the branch-and-bound tree) to
conclude that no further search is required. Since $(F, q^*)$ is
the optimal solution in $\hat{G}$, it will also be the optimal
solution in $G$ due to $d_{v,q_i} = 1, \forall v \in V, q_i \in Q$.

(2) $\beta \leq \lfloor m/2 \rfloor$. In this case, we prove that
MAGS can also find the optimal solution by generating exactly $p$
nodes in the branch-and-bound tree. In MAGS, Socio-Spatial
Ordering and the tie-breaking strategy first move the vertices in
$\bigcup_{i=\beta+1}^{m} D_i$ into $S_I$ (vertices in $D_i$ are
moved into $S_I$ earlier than those in $D_j$ if $i > j$), and
then move the remaining $p - |S_I|$ vertices from $D_\beta$ to
$S_I$ to construct the group $F$. Moving these $p$ vertices into
$S_I$ creates the first $p$ nodes in the branch-and-bound tree and
builds up a path of length $p$ from the root.

In the following, we prove that $F$ is a feasible group in
$\hat{G}$. We first examine the vertices in $F$ that are drawn
from $\bigcup_{i=\beta}^{\lfloor m/2 \rfloor} D_i$. Let $v \in F$
and $v \in \bigcup_{i=\beta}^{\lfloor m/2 \rfloor} D_i$. According
to Theorem 4 and Theorem 5, all neighbors of $v$ in $\hat{G}$ must
belong to $\bigcup_{i=\beta+1}^{m} D_i$. Since $\hat{G}$ is a
$(p - k - 1)$-core, and the extracted subgraph $F$ contains all
the vertices in $\bigcup_{i=\beta+1}^{m} D_i$, $v$ must have at
least $p - k - 1$ neighbors in $F$. By contrast, for every vertex
$u \in F$ drawn from $\bigcup_{i=\lfloor m/2 \rfloor + 1}^{m} D_i$,
since the neighbors of every vertex in $D_i$ with
$i \leq \lfloor m/2 \rfloor$ are also the neighbors of every
vertex in $D_j$ with $j > \lfloor m/2 \rfloor$ according to
Theorem 5, $u$ must have no fewer neighbors than $v$ in $F$,
where $v \in F$ is any vertex drawn from
$\bigcup_{i=\beta}^{\lfloor m/2 \rfloor} D_i$. From the above
description, a vertex $v \in F$ drawn from either
$\bigcup_{i=\beta}^{\lfloor m/2 \rfloor} D_i$ or
$\bigcup_{i=\lfloor m/2 \rfloor + 1}^{m} D_i$ must have at least
$(p - k - 1)$ neighbors in $F$. Therefore, $F$ is a feasible
group. Similar to case (1), $(F, q_{ref})$ is the optimal
solution, where $q_{ref}$ is obtained from All-Pair Distance
Ordering. Then, the distance pruning strategies conclude that no
further search is needed, and MAGS stops.

Time Complexity. In the following, we analyze the detailed
time complexity of constructing these $p$ nodes. The
pre-processing strategy takes $O(|E|)$ time. Before each of the
$p$ nodes in the branch-and-bound tree is constructed, MAGS
extracts $v_c \in S_R$ and $q_{ref} \in Q$ from the list $U_R$ of
MBRs and the corresponding list of balls. During the examination
of the two lists, when the MBR $M_i$ and the ball $B_j$
satisfying the score function in Eq. (7) are identified, each of
the three distance pruning strategies in Sec. 5.5 is performed
once. In the worst case, each time APDO obtains $v_c$ and
$q_{ref}$, each internal node of R-Tree and BallTree is accessed.
Therefore, $\max\{|V|, |Q|\}$ extractions of $M_i$ and $B_j$
satisfying Eq. (7) are performed, and each distance pruning
strategy is also performed $\max\{|V|, |Q|\}$ times. Since
extracting $M_i$ and $B_j$ satisfying Eq. (7) takes
$O(|S_I||V||Q|) = O(p|V||Q|)$ time, and OTDP, ITDP, and ALDP each
take $O(|Q|)$ time, the time complexity for MAGS to extract each
pair of $v_c \in S_R$ and $q_{ref} \in Q$ is thus
$O(\max\{|V|, |Q|\}(p|V||Q|))$. Therefore, for the $p$ nodes in
the branch-and-bound tree, it takes
$O(p^2 \max\{|V|, |Q|\}(|V||Q|))$ time for extracting $v_c$ and
$q_{ref}$ and performing the distance pruning strategies.

On the other hand, for the $p$ nodes in the branch-and-bound
tree, Eq. (1) checking and the tie-breaking strategy are
performed $p$ times, which takes $O(p^2 + p|V| \log |V|)$ time.
Also, Familiarity Pruning (Eqs. (4), (5)) is performed $p$ times,
which takes $O(p|V|^2)$ time. Moreover, $p$ additional
extractions of $v_c$ and $q_{ref}$ (with the distance pruning
strategies) are performed to conclude that no further search is
needed, which takes $O(p^2 \max\{|V|, |Q|\}(|V||Q|))$ time.
Therefore, the overall time complexity of MAGS is
$O(|E|) + O(p^2 + p|V| \log |V|) + O(p|V|^2) + 2 \cdot O(p^2 \max\{|V|, |Q|\}(|V||Q|))$.
Since the number of vertices remaining after pre-processing is
$O(|V|)$, where $G = (V, E)$ is the input graph before
pre-processing, the time complexity is
$O(|E| + p^2 \max\{|V|, |Q|\}(|V||Q|))$. □

7 Experimental Results 

We implement SSGQ in Facebook and recruit 206 people
from various backgrounds (e.g., students, and public and
private sector workers) to compare solution quality and time
overhead for answering SSGQ and MRGQ via manual coordination
and our proposed algorithms. Each user completes
24 SSGQ tasks and 20 MRGQ tasks with the social graphs
extracted from their social networks in Facebook, together
with their spatial locations sampled from their Facebook
Checkin records.

In addition to the real dataset collected from the 206
study participants, we evaluate the performance and the
solution quality of SSGS, SSGMerge (the heuristic algorithm
for SSGQ described earlier), and MAGS using a large
real dataset, DataSet_4SQ, obtained by crawling Foursquare
[20], one of the most representative LBSNs, for a month.
DataSet_4SQ contains both the social and spatial information
of 153,577 individuals. Moreover, we also compare MAGS
with two relevant algorithms, namely Geo-Social Circle of
Friend Query (gCoFQ) [13] and p-Nearest Neighbor (pNN), to
evaluate the solution quality and performance. In addition
to DataSet_4SQ, we also evaluate MAGS for MRGQ on a
large real dataset, DataSet_Youtube, which is a social
network extracted from the Youtube video-sharing website with
1,134,890 individuals. The activity location $q$ for SSGQ and
$Q$ for MRGQ are randomly selected from DataSet_4SQ, and
we measure 50 samples in each scenario.

7.1 User Study 

We perform the user study with 24 and 20 tasks for SSGQ
and MRGQ, respectively. The 24 tasks in the user study of
SSGQ span various $p$ and network sizes, where the spatial
radius $t$ is fixed to 10 km. Different $k$ are assigned in the
first 12 tasks, while $k$ is not specified in the other 12 tasks,
to let each user freely select $p$ people for finding out the
familiarity preferred by each person in activities with
different $p$.

Fig. 12: Results of user study.


Figures 12(a)-(f) compare manual coordination, SSGS, and
SSGMerge to answer SSGQ in the user study. Figure 12(a)
presents the time to find the solutions in different scenarios.
The result indicates that SSGQ is challenging for manual
coordination, especially for a large network size. In contrast,
SSGS obtains the optimal solution in less than 0.01 second,
while SSGMerge obtains a near-optimal solution in much
less time. Figure 12(b) with $p = 5$ and $k = 3$ demonstrates
that the solutions from manual coordination require larger
spatial distance and thereby are not optimal. With a larger
network size, i.e., more friends nearby, it is easier to find a
group of attendees with a smaller total spatial distance to
$q$. In addition, the correctness rate in Figure 12(c) shows
that even with $p = 5$, the solutions obtained by manual
coordination are not guaranteed to follow the familiarity
constraint, because it is very challenging for a person to
jointly minimize the total spatial distance and ensure the
familiarity constraint. Moreover, the correctness rate drops
dramatically as the network size increases. On the other hand, as
shown in Figures 12(b) and 12(c), SSGMerge obtains solutions
which are very close to the optimal solution. This is because
SSGMerge effectively utilizes the intermediate solutions to
construct good solutions.

In Figures 12(d) and 12(e), we let each user freely select $p$
people to find out the familiarity preferred by each person in
activities with different $p$. The minimum $k$ here represents
the smallest $k$ for each manual solution to follow the
familiarity constraint. With this parameter extracted from the
manual solution, we regard it as an input parameter for an
SSGQ in the same social network. The results demonstrate
that SSGS and SSGMerge can find better solutions following
the same $k$ in less time. In other words, even when a
user does not specify $k$, it is possible to analyze the previous
manual coordination results and find out a suitable $k$ for the
user, such that SSGS and SSGMerge are able to find solutions
in each query afterward.

Figure 12(f) with the network size as 15 and $p$ as 9
compares the results of different $k$. As $k$ decreases, the
correctness rate of manual coordination drops because it
becomes more difficult for a user to find a tighter social
group with the same number of attendees. Moreover, the
solution obtained by manual coordination is still worse than
the solution of SSGS and SSGMerge even with a loose
requirement on social connectivity, i.e., a large $k$.

Fig. 13: Results of user study.

These MRGQ tasks span various $p$, $k$, and $|Q|$, where $t$
is fixed to 10 km. In the user study, MAGS is equipped
with APDO and all the proposed pruning strategies. We
also compare the solution quality with an algorithm called
GreedyManual (GM), which imitates the behavior of manual
coordination. GM first finds the candidates within radius $t$ of
each activity location and picks the activity location which
has the largest number of candidates nearby. Afterwards,
if there exists a feasible group, GM returns it. Otherwise,
it repeats the above procedure with the remaining activity
locations.
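The GM procedure above can be sketched as follows. The text does not specify how GM searches for a feasible group among the nearby candidates, so this sketch simply enumerates size-$p$ subsets in input order; that choice, the Euclidean distance, and the names `is_feasible` and `greedy_manual` are assumptions. Feasibility follows the familiarity constraint used throughout: every member must know at least $p - k - 1$ other members.

```python
import math
from itertools import combinations


def is_feasible(group, friends, k):
    """True if each member is unfamiliar with at most k other members."""
    members = set(group)
    needed = len(group) - k - 1
    return all(len(friends[v] & members) >= needed for v in group)


def greedy_manual(people, friends, locations, p, k, t):
    """Return (location, group) for the first feasible group found, else None.

    people maps a person to (x, y); friends maps a person to a set of
    acquaintances; locations is a list of (x, y) activity locations.
    """
    def nearby(q):
        return [v for v, xy in people.items() if math.dist(xy, q) <= t]

    # Visiting locations by decreasing number of nearby candidates is
    # equivalent to repeatedly picking the most crowded remaining one.
    ordered = sorted(locations, key=lambda q: len(nearby(q)), reverse=True)
    for q in ordered:
        for group in combinations(nearby(q), p):
            if is_feasible(group, friends, k):
                return q, group  # stop at the first feasible group
    return None
```

Because the search stops at the first feasible group and never compares total distances across locations, the returned solution is feasible but not necessarily optimal.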

Figure 13 compares manual coordination and MAGS to
answer MRGQ in the user study. Figure 13(a) demonstrates
that the solutions from manual coordination incur larger
spatial distance and thereby are not optimal. When the number
of activity locations increases, it is easier to find a group
of attendees and an activity location with a smaller total
spatial distance. In addition, the correctness rate in Figure
13(b) shows that even when $p = 5$, the solutions obtained
by manual coordination are not guaranteed to follow the
social constraint, especially for a smaller $k$, because it is
very challenging for a user to jointly minimize the total
spatial distance and ensure the social constraint. In Figure
13(c), we let each user freely select 5 people and analyze the
familiarity parameter preferred by each person in activities.
The minimum $k$ here represents the smallest $k$ to meet the
familiarity constraint in user selection. With this parameter
extracted from the manual solution, we regard it as an input
parameter for an MRGQ query in the same social network.
The results demonstrate that it is difficult for users to handle
small $k$ and large $|Q|$ due to the need to examine many
more combinations. Thus, the distances obtained by manual
coordination deviate more from the optimal solution
obtained by MAGS.

Figure 13(d) shows that as $p$ increases, the correctness rate
and solution quality of manual coordination significantly
deteriorate because it becomes more difficult for a user to
find a tight social group. In Figure 13(e), each user can
freely select any number of people for forming the group
with $k = 3$. Manual $p$ in this figure indicates the average
group size measured in the user study. As $|Q|$ increases,
the selected group size drops because it becomes more
challenging to find the optimal group. Moreover, users need
much more time to find the group when $|Q|$ grows, even
with a small group size and a loose requirement on the
social connectivity, i.e., $p = \{4, 5\}$ and $k = 3$. Finally,
Figure 13(f) presents the time spent to find the solutions in
different scenarios. The result indicates that MRGQ is
challenging for manual coordination, especially for a large
number of potential candidate locations.

Figure 13(g) compares the computation time and solution
quality of GreedyManual (GM) and MAGS. Although GM
obtains the solutions in less time, the solution
quality is much worse than that of MAGS. This is because GM
stops as soon as a feasible group is obtained, which does not
guarantee the optimal solution.

Since MRGQ can also consider the user interests (discussed
in Section 5.6), we also compare the user satisfaction
with or without considering user interests. We ask the users
to choose 20 activity locations in MRGQ, where each location
is tagged as coffee shop, restaurant, bar, etc. The interest
measure of each activity location $q_i$ to each user $u$ is
specified as $\eta_{u,q_i}$ between 0 and 1 by the user. We let
each user compare the groups selected by MAGS and by the user
herself. Figure 13(h) with $p = 7$ and $k = 3$ compares the user
satisfaction with and without user interests incorporated.
The results show that 68% and 75% of the users agree
that the groups selected by MAGS outperform the manually
selected groups before and after incorporating the interests,
respectively. Moreover, the increment of the users that
choose "Better" after incorporating the user interests mainly
comes from those who previously chose "Acceptable". The
results demonstrate that incorporating user interests indeed
improves the user satisfaction.

7.2 Performance Evaluation of Proposed Algorithms for MRGQ

We evaluate the effectiveness and efficiency of the proposed
algorithms for MRGQ. APDO and SRDO denote MAGS with
All-Pair Distance Ordering and Single-Reference Distance
Ordering (a simplified version of APDO, which is mentioned
in Section 5.4), respectively, while Socio-Spatial Ordering
and Familiarity Pruning mentioned in Section 5.1 are also
included. In our experiments, unless specifically indicated,
we set $k = 4$, $p = 8$, $|Q| = 10,000$, and the maximum value
of $t$ is 15 km.

Fig. 14: Comparisons with relevant works on DataSet_4SQ.
Fig. 15: Performance comparisons on DataSet_4SQ.

Figure 14 first compares MAGS in MRGQ with the related
works on DataSet_4SQ, where MAGS is equipped with
APDO, Socio-Spatial Ordering and the proposed pruning
strategies. 1) Geo-Social Circle of Friend Query (gCoFQ) [13]
aims to find a group of p people to minimize the linear
combination of the social diameter and spatial diameter
(maximum spatial distance between each pair of group
members) of the selected group. In other words, there is
no activity location in gCoFQ. In the experiments, gCoFQ is
implemented to limit the spatial diameter within 2t, and the
nearest activity location after gCoFQ identifies the group is
returned as the solution. On the other hand, 2) pNN extracts
the group of p members along with their nearest activity
location without considering the familiarity constraint.
Figures 14(a) and 14(b) compare the computation time and
solution quality. Although pNN obtains the group with
the minimum time and distance, as shown in Figure 14(c),
the minimum k of the obtained group (i.e., the minimum
number of unfamiliar members each attendee has in the
group) is far from the specified k value, i.e., k = 4. In
other words, the solution returned by pNN is not feasible
for MRGQ. The solution quality of gCoFQ is worse than
that of the other two algorithms because gCoFQ does not examine
activity locations during the group formation process, while
Figure 14(c) shows that gCoFQ also has difficulty satisfying
the familiarity constraint. In contrast, MAGS follows the
familiarity constraint and can identify the optimal group
along with the nearest activity location. Figure 14(d)
compares the social diameter of the groups obtained by gCoFQ
and MAGS; the results of pNN cannot be
displayed because the groups obtained by pNN are usually
disconnected. This figure shows that, although MAGS
is not designed to minimize the social diameter, its social
diameter is still close to that of gCoFQ.
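
The social diameter compared in Figure 14(d) can be illustrated with a minimal sketch. It assumes an unweighted, undirected social graph stored as an adjacency dictionary and measures hop distances in the whole graph; the names `hop_distances` and `social_diameter` and the toy graph are hypothetical and are not taken from the evaluated implementations. A disconnected group, which is the typical outcome for pNN, has no finite diameter and is reported here as `None`.

```python
from collections import deque


def hop_distances(adj, source):
    """Breadth-first search: hop count from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj.get(u, ()):
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def social_diameter(adj, group):
    """Largest hop distance between any two group members.

    Returns None if some pair of members is not connected.
    Assumption: distances are measured in the whole social graph.
    """
    diameter = 0
    for u in group:
        dist = hop_distances(adj, u)
        for v in group:
            if v not in dist:
                return None
            diameter = max(diameter, dist[v])
    return diameter


# Toy graph: a path a-b-c-d plus an isolated vertex e.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}, "e": set()}
print(social_diameter(adj, ["a", "b", "d"]))  # 3
print(social_diameter(adj, ["a", "e"]))       # None
```

One breadth-first search per member is sufficient for the small groups (p around 8) considered in these experiments.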

Figure 15 evaluates the efficiency of MAGS on
DataSet_4SQ. Figure 15(a) compares the computation time
of the proposed algorithms with different values of t. Given
its massive search space, SSP incurs the largest computation
time as t grows. On the other hand, equipped with
Socio-Spatial Ordering, BallTree, Distance Pruning, and Familiarity
Pruning, SRDO and APDO effectively reduce the time to
acquire the optimal solution and outperform Integer Linear
Programming (ILP). Moreover, APDO requires the minimum
computation time because it can effectively minimize the
total spatial distance from S_I to q_ref during each expansion
of S_I. Figure 15(b) compares different algorithms with or
without BallTree. Both SRDO and APDO require less
computation time with BallTree and outperform ILP, since
the proposed Outer-Triangle, Inner-Triangle and Activity
Location Distance Pruning strategies are able to effectively
remove redundant activity locations (within balls) at early
stages.

Fig. 16: Performance comparisons on DataSet_4SQ.

Figures 15(c) and 15(d) present the impact of the proposed
pruning strategies, i.e., Outer-Triangle Distance Pruning
(OTDP), Inner-Triangle Distance Pruning (ITDP), Activity
Location Distance Pruning (ALDP), and Familiarity Pruning
shown in Eqs. (4) and (5) in Section 5.1 (denoted as SP_1
and SP_2). The results show that these pruning strategies
effectively process the spatial and social relationships and
indeed are critical for efficiently processing MRGQ. Moreover,
the first Familiarity Pruning (SP_1) is more powerful than
the second one (SP_2) since it derives a tighter upper bound
on the number of people acquainted with each member in
S_I.


7.3 Performance Evaluation of Proposed Algorithms for 
SSGQ 

We set the spatial radius, t, as 10 km in this set of
experiments, which is determined based on the user study.
The study indicates that most of the users are willing to
participate in impromptu activities within 10 km from them.

For SSGMerge, we empirically set A within the range 50 <
A < 800, because we observe that in this range, SSGMerge
incurs small execution time while the obtained solutions
achieve significant improvement over the straightforward
approach, i.e., the i-th feasible solution. On the other hand, w
should not be set too small, e.g., smaller than 10,000, because
the intermediate solutions in this case will not be able to
include a sufficient number of distinct candidates, thus
limiting the possibility of constructing good solutions.

Figures 16(a) and 16(b) analyze the proposed strategies
in Section 4, where SO, DP and FP denote Socio-Spatial
Ordering, Distance Pruning and Familiarity Pruning,
respectively. The result indicates that Socio-Spatial Ordering
(SO) is effective in reducing the execution time. This is
because Socio-Spatial Ordering considers both spatial and
social domains and thereby is able to guide the efficient
search of the feasible solutions and the optimal solution
by exploring fewer states in the branch-and-bound tree, as
shown in Figure 16(b).

Fig. 17: Comparisons of the proposed algorithms with ILP
for SSGQ.

Figures 16(c) and 16(d) compare SSGS with different
parameter settings. Figure 16(c) indicates that the execution
time increases as p grows, because SSGS in this case needs
to explore a larger search space to find the optimal solution.
On the other hand, Figure 16(d) shows that a larger k leads to
a smaller execution time because it becomes easier to obtain
feasible solutions for Distance Pruning to trim the search
space.


7.4 Comparisons with ILP for SSGQ 

Figure 17 compares the performance of Integer Linear
Programming (ILP) with SSGMerge and SSGS. In our
experiments, a renowned general-purpose commercial parallel
optimizer, CPLEX [1], is adopted to find the optimal solutions
with the proposed ILP formulation, while both SSGMerge
and SSGS are single-threaded programs. ILP here represents
a baseline benchmark for examining the efficiency of the
proposed algorithms. Although SSGS and SSGMerge run
in a single thread, they still outperform ILP. This is because
SSGMerge and SSGS carefully include effective pruning
and ordering strategies. Moreover, SSGMerge exploits the
structure of intermediate solutions. Therefore, SSGMerge
achieves superior performance over SSGS and ILP.


7.5 Performance Evaluation of SSGMerge 

Figure 18 compares the solution quality and execution time
of SSGMerge with different settings, and the default value
of w is set to 20k. In Figures 18(a) and 18(b), to compare the
solution quality of SSGMerge and SSGS, we first measure the
execution time of SSGMerge and then stop SSGS after the
same length of time, and denote this solution as SSGTimeCut.
Figure 18(a) shows the ratio of SSGMerge to SSGTimeCut,
i.e., the total spatial distance of solutions obtained by
SSGMerge divided by the total spatial distance of solutions
obtained by SSGTimeCut, with different A. When A increases,
SSGMerge can obtain better solutions since it examines
more intermediate solutions and is more likely to extract
a better one. Moreover, when t grows, the improvement
from SSGMerge becomes more significant. This is because
it becomes more difficult for SSGS to extract good feasible
solutions at early stages, but SSGMerge can effectively merge
existing intermediate solutions to obtain good solutions.
Figure 18(b) shows the execution time of SSGMerge. When
t grows, the execution time increases slowly. This is because
the size of the state sets is fixed and the extra computation
incurred by merging intermediate solutions is thus limited.

Also, we compare the solutions returned by SSGMerge
with the optimal solutions returned by SSGS in Figures 18(c)
and 18(d). Figure 18(c) displays the ratio of the optimal
solution and the solution obtained by SSGMerge, i.e., the
optimal solution values divided by the solution values
obtained by SSGMerge. The result shows that the solutions
obtained by SSGMerge are close to the optimal solution.
Figure 18(d) compares the execution time of SSGMerge and
SSGS, where SSGMerge outperforms SSGS and the execution
time of SSGMerge increases very slowly when t grows.

Fig. 18: Experimental results of SSGMerge.

Figure 18(e) presents the solution quality of SSGMerge
with different w. Here we set A as 200. The total spatial
distance decreases when w grows, because with a larger w,
SSGMerge can examine more distinct candidate attendees in
different intermediate solutions, which enables SSGMerge to
construct better solutions.

In Figure 18(f), we compare the solution quality for
different p of SSGMerge and the straightforward approach
with specified i, i.e., the i-th feasible solution in SSGS. We
first measure the execution time of SSGS to obtain different
feasible solutions and then set a proper A to stop SSGMerge
after the same length of time, while OPT represents the
optimal solution returned by SSGS. Figure 18(f) shows that
SSGMerge can obtain solutions which are close to the
optimal solution and outperform the straightforward approach.
Moreover, although the solutions obtained by the
straightforward approach converge quickly when i increases,
SSGMerge can still obtain much better solutions. This is because
SSGMerge each time combines a group with multiple
attendees, which greatly reduces the time for expanding S_I one
by one. Moreover, Socio-Spatial Ordering ensures that the
early expanded groups incur small total spatial distances.
Therefore, SSGMerge can produce solutions which are close
to the optimal solution.

7.6 Comparisons of MRGQ with Relevant Works

To compare with the state-of-the-art methods with different
parameters, we have conducted more experiments by
varying k, |Q|, and p. The results are presented in Figure 19.
Figure 19(a) compares the feasibility ratio (i.e., the ratio of
the obtained solutions satisfying the familiarity constraint)
of MAGS and the other relevant approaches with different
k. The proposed MAGS achieves a 100% feasibility ratio
with different k because the proposed Familiarity Pruning
strategy effectively trims all the intermediate solutions that
will not satisfy the familiarity constraint at an early stage. As
k decreases, the feasibility ratios of gCoFQ and pNN drop
because pNN does not consider the social domain, while
gCoFQ is designed to minimize a linear combination of the
social diameter and spatial distance of the group members. It
is worth noting that the feasibility ratio of pNN is low and
unacceptable even with a loose familiarity constraint (i.e.,
k = 6) because the groups returned by pNN are usually
disconnected.

Fig. 19: Comparisons with relevant works with different
parameters.

Figures 19(b) and 19(c) compare the solution quality and
feasibility ratio of different approaches. When the number
|Q| of candidate locations increases, all approaches can find
solutions with smaller total spatial distances because more
good choices become available. Nevertheless, the proposed MAGS
outperforms gCoFQ in terms of solution quality because
gCoFQ does not examine activity locations but only tries
to minimize the maximum spatial distance between each
pair of group members. Although pNN acquires the groups
with the minimum spatial distances, Figure 19(c) shows
that the feasibility ratio of pNN is very small because the
individuals who are closest to an activity location do not
satisfy the familiarity constraint in most cases.

Figure 19(d) presents the feasibility ratio with different
p. Given the familiarity constraint k = 4, the proposed
MAGS always obtains solutions satisfying the familiarity
constraint (i.e., the feasibility ratio is 100%). However, when p
increases, the feasibility ratios of gCoFQ and pNN drop.
This is because more group members need to be
connected to each other when p is larger, and it is thus
more difficult for gCoFQ and pNN to follow the familiarity
constraint. Moreover, although gCoFQ is able to obtain
groups with small social diameters, a few group members
are still unacquainted with many other members. Therefore,
the feasibility ratio of gCoFQ is still not sufficient.
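
The feasibility ratio used in Figure 19 can be made concrete with a short sketch. It assumes that the familiarity constraint means every attendee has at most k unacquainted members in the group, and that the social graph is an adjacency dictionary; the function names and the toy data are hypothetical and only illustrate the measure.

```python
def unfamiliar_count(adj, group, v):
    """Number of other group members that v is not acquainted with."""
    return sum(1 for u in group if u != v and u not in adj.get(v, ()))


def is_feasible(adj, group, k):
    """Assumed familiarity constraint: every member has at most k
    unacquainted members in the group."""
    return all(unfamiliar_count(adj, group, v) <= k for v in group)


def feasibility_ratio(adj, groups, k):
    """Fraction of returned groups that satisfy the familiarity constraint."""
    if not groups:
        return 0.0
    return sum(is_feasible(adj, g, k) for g in groups) / len(groups)


# Toy data: a triangle {1, 2, 3}, an isolated vertex 4, and an edge 5-6.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}, 4: set(), 5: {6}, 6: {5}}
groups = [[1, 2, 3], [1, 2, 4], [4, 5, 6]]
print(feasibility_ratio(adj, groups, k=0))  # only the triangle is feasible
print(feasibility_ratio(adj, groups, k=2))  # every group is feasible
```

A group that contains a socially isolated member, as in the second and third toy groups, fails the check for any k below p − 1, which mirrors the behavior reported for pNN.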


7.7 Experimental Results for MAGS in Dynamic Environments

Fig. 20: Comparisons of R*-Tree and TPR-Tree.
Fig. 21: Effectiveness of distance pruning strategies.

We perform experiments for MRGQ when the locations of
the users are dynamically changing. We generate the user
trajectories according to [22] as follows. We first extract from
DataSet_4SQ the locations visited by each individual, and
each individual is placed in one of its visited locations with
equal probability and a maximum speed of 0.75 km/min.
The destination of each user is randomly picked from other
visited locations. During the first 1/6 of the route, users
accelerate from zero speed to their maximum speeds. During
the middle 2/3 of the route, they travel at their maximum
speeds; and in the last 1/6 of the route, they decelerate.
When a user reaches her destination, a new destination is
assigned at random to her.
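
The three-phase speed profile described above can be sketched as follows. The text does not state how users accelerate, so this sketch assumes constant acceleration over the first 1/6 of the route and constant deceleration over the last 1/6; the function names are hypothetical.

```python
import math


def speed_at(s, route_len, v_max):
    """Speed (km/min) after travelling s km along a route of route_len km.

    Assumption: constant acceleration on the first 1/6 of the route and
    constant deceleration on the last 1/6, so v**2 is linear in s there.
    """
    if s < 0 or s > route_len:
        raise ValueError("s must lie on the route")
    ramp = route_len / 6.0
    if s < ramp:
        return v_max * math.sqrt(s / ramp)
    if s > route_len - ramp:
        return v_max * math.sqrt((route_len - s) / ramp)
    return v_max


def travel_time(route_len, v_max):
    """Total time (min) for the route under the same assumption.

    Each ramp covers route_len/6 at an average speed of v_max/2, and the
    middle 2/3 of the route is covered at v_max.
    """
    ramp_time = (route_len / 6.0) / (v_max / 2.0)
    cruise_time = (2.0 * route_len / 3.0) / v_max
    return 2.0 * ramp_time + cruise_time


# A 6 km route at the maximum speed of 0.75 km/min used in the experiments.
print(speed_at(0.25, 6.0, 0.75))  # still accelerating
print(speed_at(3.0, 6.0, 0.75))   # cruising at 0.75 km/min
print(travel_time(6.0, 0.75))     # 4 * 6 / (3 * 0.75) minutes
```

Under this assumption the whole route takes 4/3 of the time it would take at constant maximum speed.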

We compare MRGQs with R*-Tree [24] and TPR-Tree
[22] in our experiments by changing the ratio of moving
users (i.e., a specific ratio of users are moving, and the rest
are static), and measure the number of index updates for
MRGQs in the two index structures. In addition, query
execution time measures the time to process each query with the
two index structures. In our experiments, we issue MRGQs
randomly at 30 different time slices within 90 minutes from
the start time. The query parameters are set as t = 9 km,
p = 8, k = 4, |Q| = 10,000, and tf = 0.

Figure 20(a) compares the number of index updates of
R*-Tree and TPR-Tree with different ratios of moving users.
As the ratio increases, the number of updates grows rapidly
for R*-Tree. This is because when the number of moving
users increases, more location updates occur, and R*-Tree
updates the user locations, splits MBRs, and rebalances the
index structure more frequently. In contrast, the number of
index updates of TPR-Tree is small compared to R*-Tree
because TPR-Tree incorporates the velocity vectors for MBRs
to avoid the frequent updates. Figure 20(b) compares the
query execution time of R*-Tree and TPR-Tree with different
ratios of moving users. The execution times of MRGQs with
R*-Tree and TPR-Tree are both small. The execution time of
MRGQs in R*-Tree is smaller than that in TPR-Tree because
R*-Tree spends much time on updating the index structure
and maintains smaller MBRs. The smaller MBRs in R*-Tree
provide better index capability. In contrast, the execution
time of TPR-Tree increases when the ratio of the number
of moving users grows because in this case, the MBRs are
larger and the index capability deteriorates.
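
The trade-off between the two indexes can be illustrated with a minimal sketch of a time-parameterized bounding rectangle, the idea underlying TPR-Tree: each edge moves with a bounding velocity, so the rectangle stays valid without updates but grows over time. The class and field names are hypothetical, and the sketch ignores tree maintenance entirely.

```python
from dataclasses import dataclass


@dataclass
class TimeParameterizedMBR:
    """Rectangle whose edges move with bounding velocities."""
    x_lo: float
    x_hi: float
    y_lo: float
    y_hi: float
    vx_lo: float  # smallest x-velocity of any enclosed user
    vx_hi: float  # largest x-velocity of any enclosed user
    vy_lo: float  # smallest y-velocity of any enclosed user
    vy_hi: float  # largest y-velocity of any enclosed user
    t_ref: float = 0.0  # time at which the static bounds were recorded

    def bounds_at(self, t):
        """Conservative bounds at time t >= t_ref, with no index update."""
        dt = t - self.t_ref
        return (self.x_lo + self.vx_lo * dt, self.x_hi + self.vx_hi * dt,
                self.y_lo + self.vy_lo * dt, self.y_hi + self.vy_hi * dt)

    def area_at(self, t):
        """Area of the conservative rectangle at time t."""
        x_lo, x_hi, y_lo, y_hi = self.bounds_at(t)
        return (x_hi - x_lo) * (y_hi - y_lo)


# Users start in a 1 km x 1 km box and may move at up to 0.75 km/min
# in any axis direction.
mbr = TimeParameterizedMBR(0.0, 1.0, 0.0, 1.0, -0.75, 0.75, -0.75, 0.75)
print(mbr.area_at(0.0))   # 1.0
print(mbr.area_at(10.0))  # 256.0
```

After 10 minutes the rectangle covers 256 times its original area, which is consistent with the larger MBRs and slower queries observed for TPR-Tree when more users move.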

7.8 Performance Comparisons for MAGS with Different 
Pruning Strategies 

We compare SRDO, APDONoOTDP, APDONoITDP,
APDONoALDP, and APDO by changing parameters k and p.
Figure 21 presents the experimental results with the default
parameters t = 9 km, p = 8, k = 4, and |Q| = 10,000. When
k or p increases, the computation time of APDO increases
slowly. As shown, APDO outperforms the other approaches,
i.e., APDONoALDP, APDONoOTDP, APDONoITDP, and
SRDO, because the proposed distance pruning strategies
effectively avoid redundant activity location examinations.
When k increases, many approaches other than APDO incur
more computation time. This is because when k becomes
larger, Familiarity Pruning is less effective due to the loosened
familiarity constraint. Nevertheless, the proposed APDO
with all the distance pruning strategies is able to avoid
redundant S_I expansions. Similarly, when p increases, the
search space also increases. APDO with the pruning
strategies can thus stop expanding S_I earlier. Therefore, APDO
outperforms the other approaches. Finally, APDO explores
many possible q_ref and v_c during the expansion of S_I to
obtain good solutions much earlier than SRDO does. In
other words, the distance pruning strategies in APDO are very
effective, which enables the proposed APDO to significantly
outperform SRDO.

Fig. 22: Sensitivity tests of MAGS on DataSet_Youtube.

7.9 Experimental Results for MAGS on DataSet_Youtube

We evaluate the proposed MAGS on DataSet_Youtube, which
is a social network extracted from the Youtube video-sharing
website with 1,134,890 individuals. Since there is no spatial
information for this dataset, we randomly assign spatial
coordinates to each individual as her current location. In
our experiments, we randomly extract the activity locations
from DataSet_4SQ. In the following experiments, unless
specifically indicated, we set k = 4, p = 8, |Q| = 10,000,
and the maximum value of t is 15 km.

Figure 22 evaluates the proposed algorithms on
DataSet_Youtube. Figure 22(a) shows that, although
DataSet_Youtube contains about 8 times the number of
candidates as DataSet_4SQ does, SRDO and APDO both
incur small computation time. This
is because DataSet_Youtube is socially sparse (i.e., with an
average degree of 5.27), which enables the social pruning
strategies to effectively remove redundant search space.
Figures 22(b)-(d) compare APDO with different parameter
settings. Since DataSet_Youtube contains fewer spatially
dense clusters, the computation time is limited with the
distance pruning strategies. Moreover, Familiarity Pruning
is able to quickly prune unqualified groups due to the large
number of low-degree nodes in the social graph, and the
computation time is thus reduced.
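
The effect of low-degree nodes can be illustrated with a simple necessary condition, which is not the Familiarity Pruning of Eqs. (4) and (5) but follows directly from the constraint. Assuming that each attendee may be unacquainted with at most k of the other p − 1 attendees, any candidate with fewer than p − 1 − k acquaintances cannot appear in a feasible group. The function name and the toy graph are hypothetical.

```python
def degree_filter(adj, p, k):
    """Keep only vertices that could belong to some feasible group.

    Assumed familiarity constraint: each attendee may be unacquainted with
    at most k of the other p - 1 attendees, so an attendee needs at least
    p - 1 - k acquaintances in the social graph.
    """
    min_degree = max(0, p - 1 - k)
    return {v for v, friends in adj.items() if len(friends) >= min_degree}


# Toy sparse graph; with p = 8 and k = 4 a candidate needs degree >= 3.
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"a", "b", "c", "e"},
    "e": {"d"},
    "f": set(),
}
print(sorted(degree_filter(adj, p=8, k=4)))  # ['a', 'b', 'c', 'd']
```

In a graph with an average degree of 5.27, a filter of this kind already removes a large share of the candidates before any group is expanded.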

8 Conclusion 

To address the need for automatic activity planning based
on the social and spatial relationships of attendees and
activity locations, we define a new query, namely MRGQ,
to jointly find the optimal set of attendees and the best
activity location among multiple activity locations. We also
study a special case of MRGQ, namely SSGQ, which only
features a single activity location. We show that processing
MRGQ is NP-hard and inapproximable within any factor.
We formulate MRGQ with Integer Linear Programming and
propose an efficient algorithm, namely MAGS. In addition
to indexing the candidate attendees in R-Tree, we propose
to index the candidate locations in BallTree, and devise
various ordering and pruning strategies based on the social
and spatial relationships. Experimental results show that
the computation time required by single-threaded MAGS
is much smaller than that of the IBM CPLEX parallel
optimizer. Moreover, we show that the problem of processing
SSGQ is NP-hard and devise an efficient algorithm, namely
SSGS, to process SSGQ. Various strategies, including
Distance Ordering, Socio-Spatial Ordering, Distance Pruning,
and Familiarity Pruning are proposed to prune redundant
search space and obtain the optimal solution efficiently. We
also implement SSGQ in Facebook and conduct user studies
for both SSGQ and MRGQ to demonstrate that the proposed
algorithms significantly outperform manual coordination in
terms of both solution quality and efficiency.
Appendix A

Pseudo Codes of the Proposed Algorithms

The pseudo codes of the proposed algorithms, i.e., SSGS,
SFGP, MAGS with Single-Reference Distance Ordering, and
MAGS with All-Pair Distance Ordering, are presented as
follows.
Algorithm 1 SSGS algorithm

Require: Graph G = (V, E), location l_v for each v ∈ V, the number of attendees p, activity location q, familiarity constraint
k, and spatial radius t. The user locations l_v, ∀v ∈ V, are indexed by an R-Tree.

Ensure: Optimal group F.

S_I ← ∅, curDist ← 0, F ← ∅, D ← ∞, θ ← k
Employ R-Tree Range Query on q to find the vertices within distance t as S_R
FINDGROUP(S_I, S_R, curDist)
if D ≠ ∞ then
    output F
else
    output "No Answer"
end if

procedure FINDGROUP(inS_I, inS_R, curDist)
    S_I ← inS_I, S_R ← inS_R
    while |S_I| + |S_R| ≥ p do
        if there is any unvisited vertex in S_R then
            Employ R-Tree distance browsing to extract from S_R the next unvisited vertex u
            which has the minimum spatial distance to q; mark u as visited
        else if θ < p − 1 then
            increase θ and mark the remaining vertices in S_R as unvisited
        else
            break
        end if
        if u satisfies the condition of Socio-Spatial Ordering in Eq. (1) then
            S_I ← S_I + {u}, S_R ← S_R − {u}, curDist ← curDist + d_{u,q}
            if Familiarity Pruning in Eq. (2) or Distance Pruning in Eq. (3) is satisfied then
                break
            else if |S_I| < p then
                FINDGROUP(S_I, S_R, curDist)
            else
                D ← curDist, F ← S_I
                break
            end if
        else if θ = p − 1 then
            S_R ← S_R − {u}
        end if
    end while
end procedure
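
To make the branch-and-bound idea behind Algorithm 1 concrete, the following is a simplified, runnable sketch and not a transcription of SSGS. It replaces the R-Tree with a sorted list, omits Socio-Spatial Ordering and the θ relaxation, and uses two generic prunings in place of Eqs. (2) and (3): a familiarity check on the partial group and a distance bound built from the nearest remaining candidates. It assumes the familiarity constraint means each attendee has at most k unacquainted members in the group; all names and the toy data are hypothetical.

```python
import math


def ssgq_branch_and_bound(adj, loc, q, p, k, t):
    """Find p vertices within distance t of q that minimize the total
    distance to q, with every member unacquainted with at most k others.

    adj: vertex -> set of acquainted vertices; loc: vertex -> (x, y).
    Returns (group, total_distance), or (None, inf) if none exists.
    """
    # Range query replacement: candidates within t of q, nearest first.
    cand = sorted((math.dist(loc[v], q), v) for v in loc
                  if math.dist(loc[v], q) <= t)
    dists = [d for d, _ in cand]
    verts = [v for _, v in cand]
    best = {"group": None, "dist": math.inf}

    def unfamiliar(v, group):
        return sum(1 for u in group if u != v and u not in adj.get(v, ()))

    def expand(start, group, cur_dist):
        if len(group) == p:
            if cur_dist < best["dist"]:
                best["group"], best["dist"] = list(group), cur_dist
            return
        need = p - len(group)
        for i in range(start, len(verts) - need + 1):
            # Distance bound: candidates are sorted, so the next `need`
            # of them give the cheapest possible completion from i on.
            if cur_dist + sum(dists[i:i + need]) >= best["dist"]:
                break
            group.append(verts[i])
            # Unfamiliar counts only grow as the group grows, so an
            # infeasible partial group can be discarded immediately.
            if all(unfamiliar(u, group) <= k for u in group):
                expand(i + 1, group, cur_dist + dists[i])
            group.pop()

    expand(0, [], 0.0)
    return best["group"], best["dist"]


# Toy instance: a, c, d are mutual friends; b is close but knows only e.
loc = {"a": (1, 0), "b": (0, 2), "c": (3, 0), "d": (0, 4), "e": (10, 0)}
adj = {"a": {"c", "d"}, "b": {"e"}, "c": {"a", "d"},
       "d": {"a", "c"}, "e": {"b"}}
print(ssgq_branch_and_bound(adj, loc, (0, 0), p=3, k=0, t=5))
print(ssgq_branch_and_bound(adj, loc, (0, 0), p=3, k=2, t=5))
```

With k = 0 the nearby but unacquainted vertex b must be skipped, so the farther clique {a, c, d} is returned; with k = 2 the spatially closest group {a, b, c} becomes feasible.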







Algorithm 2 SFGP algorithm

Require: Graph G = (V, E), location l_v for each v ∈ V, the number of attendees p, activity locations Q, familiarity
constraint k, and spatial radius t. The user locations l_v, ∀v ∈ V, are indexed by an R-Tree.

Ensure: Optimal group F and the corresponding activity location q*

S_I ← ∅, S_R ← V, Q_I ← Q, F ← ∅, D ← ∞, θ ← k
find u ∈ S_R and q_ref ∈ Q_I such that u and q_ref are the spatially closest pair
FINDGROUPANDLOC_SFGP(S_I, S_R, Q_I, q_ref)
if D ≠ ∞ then
    output (F, q*)
else
    output "No Answer"
end if

procedure FINDGROUPANDLOC_SFGP(inS_I, inS_R, inQ_I, q_ref)
    S_I ← inS_I, S_R ← inS_R, Q_I ← inQ_I
    while |S_I| + |S_R| ≥ p do
        if there is any unvisited vertex in S_R then
            Employ R-Tree distance browsing to extract from S_R the next unvisited vertex u
            which has the minimum spatial distance to q_ref
            mark u as visited
        else if θ < p − 1 then
            increase θ and mark the remaining vertices in S_R as unvisited
        else
            break
        end if
        if u satisfies the condition of Socio-Spatial Ordering in Eq. (1) then
            S_I ← S_I + {u}, S_R ← S_R − {u}
            if Familiarity Pruning in Eq. (4) or Eq. (5) is satisfied then
                break
            end if
            for all q_i ∈ Q_I do
                if S_I and q_i satisfy Distance Pruning in Eq. (3) then
                    Q_I ← Q_I − {q_i}
                else if ∃v ∈ S_I such that d_{v,q_i} > t then
                    Q_I ← Q_I − {q_i}
                end if
            end for
            if Q_I = ∅ then
                break
            end if
            if |S_I| < p then
                FINDGROUPANDLOC_SFGP(S_I, S_R, Q_I, q_ref)
            else
                if min_{q_i ∈ Q_I} Σ_{v ∈ S_I} d_{v,q_i} < D then
                    q* ← argmin_{q_i ∈ Q_I} Σ_{v ∈ S_I} d_{v,q_i}
                    D ← Σ_{v ∈ S_I} d_{v,q*}, F ← S_I
                end if
                break
            end if
        else if θ = p − 1 then
            S_R ← S_R − {u}
        end if
    end while
end procedure
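
The final step of Algorithm 2, choosing q* for a complete group, can be sketched in isolation. The sketch assumes Euclidean distances on plane coordinates and a non-empty group, and scans the candidate locations linearly; the function name and the sample coordinates are hypothetical.

```python
import math


def best_location(group_locs, candidate_locs, t):
    """Return (q_star, total_distance) over the candidates that lie within
    distance t of every group member, or (None, inf) if none qualifies."""
    best_q, best_total = None, math.inf
    for q in candidate_locs:
        dists = [math.dist(v, q) for v in group_locs]
        if max(dists) > t:
            continue  # some member would be farther than the radius t
        total = sum(dists)
        if total < best_total:
            best_q, best_total = q, total
    return best_q, best_total


group = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
candidates = [(0.0, 0.0), (2.0, 0.0), (10.0, 10.0)]
print(best_location(group, candidates, t=5.0))  # ((0.0, 0.0), 7.0)
print(best_location(group, candidates, t=3.7))  # (2.0, 0.0) is the only one left
```

Tightening the radius from 5.0 to 3.7 removes the location with the smallest total distance because one member is 4 km away from it, so a location with a larger total distance is returned.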







Algorithm 3 MAGS algorithm with APDO

Require: Graph G = (V, E), location l_v for each v ∈ V, the number of attendees p, activity locations Q, familiarity constraint
k, and spatial radius t. The user locations l_v, ∀v ∈ V, are indexed by an R-Tree with root M_0. The activity locations q_i ∈ Q
are indexed by a BallTree with root B_0.
Ensure: Optimal group F and the corresponding activity location q*

S_I ← ∅, S_R ← V, Q_I ← Q, F ← ∅, D ← ∞, θ ← k
(F, q*) ← FINDGROUPANDLOC_APDO(S_I, S_R, Q_I, B_0)
if D ≠ ∞ then
    output (F, q*)
else
    output "No Answer"
end if

procedure FINDGROUPANDLOC_APDO(inS_I, inS_R, inQ_I, inB_0)
    S_I ← inS_I, S_R ← inS_R, Q_I ← inQ_I, B_0 ← inB_0
    while |S_I| + |S_R| ≥ p do
        if there is any unvisited vertex in S_R then
            {v_c, q_ref} ← APDOANDDISTPRUNING(M_0, B_0, S_I, S_R, Q_I), u ← v_c
            let Q_I be the set of leaf nodes in BallTree which are neither pruned
            nor the descendants of a pruned ball
            mark u as visited
        else if θ < p − 1 then
            increase θ and mark the remaining vertices in S_R as unvisited
        else
            break
        end if
        if u satisfies the condition of Socio-Spatial Ordering in Eq. (1) then
            S_I ← S_I + {u}, S_R ← S_R − {u}
            if Familiarity Pruning in Eq. (4) or Eq. (5) is satisfied then
                break
            end if
            for all q_i ∈ Q_I do
                if S_I and q_i satisfy Distance Pruning in Eq. (3) then
                    mark q_i as pruned, Q_I ← Q_I − {q_i}
                else if ∃v ∈ S_I such that d_{v,q_i} > t then
                    mark q_i as pruned, Q_I ← Q_I − {q_i}
                end if
            end for
            if Q_I = ∅ then
                break
            end if
            if |S_I| < p then
                FINDGROUPANDLOC_APDO(S_I, S_R, Q_I, B_0)
            else
                if min_{q_i ∈ Q_I} Σ_{v ∈ S_I} d_{v,q_i} < D then
                    q* ← argmin_{q_i ∈ Q_I} Σ_{v ∈ S_I} d_{v,q_i}
                    D ← Σ_{v ∈ S_I} d_{v,q*}, F ← S_I
                end if
                break
            end if
        else if θ = p − 1 then
            S_R ← S_R − {u}
        end if
    end while
end procedure
APDOandDistPruning

procedure APDOANDDISTPRUNING(M_0, B_0, S_I, S_R, Q_I)
    let U_M and U_B be two lists
    M ← M_0, B ← B_0
    U_M ← M_0, U_B ← B_0
    v_c ← ∅, q_ref ← ∅
    while M and B are not both leaf nodes do
        pop MBR M_i from U_M and pop ball B_j from U_B such that
        Σ_{v ∈ S_I} MINDIST(v, B_j) + MINDIST(M_i, B_j) is minimum, and B_j is not pruned
        M ← M_i, B ← B_j
        if M, B, and B_y satisfy OTDP in Lemma 2, which prunes B_y in U_B, then
            remove B_y from U_B, mark B_y as pruned
        else if ITDP in Lemma 4 prunes B_y in U_B then
            remove B_y from U_B, mark B_y as pruned
        else if ALDP in Lemma 5 prunes B_y in U_B then
            remove B_y from U_B, mark B_y as pruned
        end if
        if M is not a leaf node then
            for all child MBR M_i of M do
                if M_i contains v where v ∈ S_R then
                    push M_i into U_M
                end if
            end for
        end if
        if B is not a leaf node then
            for all child ball B_j of B do
                push B_j into U_B
            end for
        end if
    end while
    v_c ← M, q_ref ← B
    return (v_c, q_ref)
end procedure
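
The ordering key used when popping an (MBR, ball) pair can be sketched with the standard minimum-distance bounds. The sketch assumes two-dimensional Euclidean space, an MBR given by its lower-left and upper-right corners, and a ball given by its center and radius; the function names are hypothetical and the OTDP, ITDP and ALDP lemmas are not reproduced.

```python
import math


def mindist_point_ball(v, center, radius):
    """Smallest possible distance from point v to any point in the ball."""
    return max(0.0, math.dist(v, center) - radius)


def mindist_mbr_ball(mbr, center, radius):
    """Smallest possible distance between a rectangle and a ball.

    mbr is ((x_lo, y_lo), (x_hi, y_hi)).
    """
    (x_lo, y_lo), (x_hi, y_hi) = mbr
    # Closest point of the rectangle to the ball center, found by clamping.
    cx = min(max(center[0], x_lo), x_hi)
    cy = min(max(center[1], y_lo), y_hi)
    return max(0.0, math.dist((cx, cy), center) - radius)


def pair_key(group_locs, mbr, center, radius):
    """Ordering key for an (MBR, ball) pair: the sum of MINDIST from each
    selected attendee to the ball, plus MINDIST from the MBR to the ball."""
    return (sum(mindist_point_ball(v, center, radius) for v in group_locs)
            + mindist_mbr_ball(mbr, center, radius))


group_locs = [(0.0, 0.0)]
mbr = ((4.0, 0.0), (6.0, 2.0))
far_ball = ((10.0, 0.0), 1.0)
near_ball = ((3.0, 0.0), 1.0)
print(pair_key(group_locs, mbr, *far_ball))   # 9.0 + 3.0
print(pair_key(group_locs, mbr, *near_ball))  # 2.0 + 0.0
```

Because both terms are lower bounds on true distances, a pair with a smaller key can only contain candidates and locations that are at least that close, which is why such pairs are examined first.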







